Toronto, September 16 2024
Author: Atsu Vovor
Master of Management in Artificial Intelligence,
Consultant Data Analytics Specialist | Machine Learning | Data Science | Quantitative Analysis | French & English Bilingual
This project presents the development of an advanced stock portfolio analytics tool designed to assist portfolio managers in optimizing investment strategies. By leveraging statistical analysis, mathematical and machine learning techniques, the tool provides insights into stock asset pricing, risk assessment, asset allocation, and performance forecasting. The project outlines the methodology used, including data collection and preprocessing, exploratory data analysis, model selection and evaluation metrics, and stress testing under scenarios built on key economic performance indicators. Results demonstrate the tool's effectiveness in enhancing decision-making processes, potentially leading to improved portfolio performance. The findings highlight the importance of integrating modern analytics into traditional portfolio management to navigate the complexities of today's financial markets.
The growing complexity of financial instruments and risk factors places significant pressure on portfolio managers, who must navigate and analyze a vast and intricate flow of data each day. Using a robust dataset of historical stock prices, economic indicators, and financial metrics, we aim to develop a stock portfolio analysis tool that applies advanced statistical methods, portfolio optimization, and machine learning techniques to help portfolio managers make informed decisions. The tool provides insights into asset pricing, risk assessment, asset allocation, and performance forecasting.
To achieve this goal, we begin by dynamically collecting real-time data: the adjusted close prices of all S&P/TSX Composite constituents and Canadian economic factors. The methodology involves data preprocessing to ensure accuracy and relevance, followed by exploratory data analysis (EDA) to uncover key trends and correlations. Principal Component Analysis (PCA) is applied to reduce the dimensionality of the dataset, enabling the identification of the most influential factors affecting portfolio performance. We then use correlation analysis and hierarchical clustering to categorize stocks into distinct groups, facilitating diversification and risk management.
Moreover, the project explores advanced asset pricing techniques such as stochastic differential equations and Monte Carlo simulation, combined with modern portfolio theory (MPT), to simulate portfolio price, profit & loss, and risk and to construct efficient portfolios. It also applies stress testing techniques to evaluate portfolio robustness under various economic scenarios. The results demonstrate significant improvements in risk-adjusted returns, providing actionable insights for portfolio managers and investors.
In conclusion, this project underscores the importance of integrating advanced analytics into investment decision-making processes. The findings offer a valuable framework for optimizing stock portfolios, enhancing performance, and managing risk in an increasingly complex financial environment.
This project presents an in-depth analysis of stock portfolio management through the application of advanced data analytics techniques. It aims to address the challenges investors face in optimizing their portfolios by incorporating a data-driven approach to decision-making. By analyzing historical stock prices, financial indicators, and macroeconomic variables, the project seeks to develop strategies that maximize returns while minimizing risk.
Scope of the Project
The scope of this project includes the following key areas:
1. Data Collection and Preprocessing:
2. Exploratory Data Analysis (EDA):
3. Dimensionality Reduction and Portfolio Construction using Correlation Analysis, Clustering and Principal Component Analysis (PCA)
Correlation Analysis, Clustering and Portfolio Construction: Hierarchical clustering techniques are applied to group stocks into clusters based on their similarities in performance, risk profile, and other attributes. This clustering facilitates the selection of a diversified set of assets for portfolio construction, so that the portfolio is balanced and less susceptible to market shocks.
Principal Component Analysis (PCA): To manage the complexity of the dataset and focus on the most impactful variables, PCA is used to reduce the number of factors considered in the analysis. It identifies the principal components that explain the majority of the variance in the data, enabling the selection of the most relevant indicators for portfolio construction.
Stacking PCA, Correlation Analysis and Clustering for Diversified Portfolio Construction: Combining correlation analysis, clustering, and PCA helps construct a well-diversified portfolio.
4. Asset Pricing, Profit & Loss Simulation and Risk Calculation
5. Portfolio Optimization:
6. Investment Risk Profiles Simulation using K-Means Clustering applied to random portfolio
7. Stress Testing and Scenario Analysis:
Tools and Technologies
The project leverages various tools and technologies, including:
Key Outcomes
The project yields several key outcomes:
#pip install stats-can
#conda pip install -c districtdatalabs yellowbrick on Anaconda Prompt
#conda install conda=24.5.0
#conda install conda-forge::stats_can
import yfinance as yf
import pandas as pd
from datetime import date, timedelta
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from scipy.stats import norm, lognorm, exponnorm, logistic, erlang,gennorm
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
#from sklearn.metrics import root_mean_squared_error
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_log_error
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
#from yellowbrick.cluster import KElbowVisualizer
from scipy.optimize import curve_fit
import random
from statistics import NormalDist
from scipy import stats
from fitter import Fitter, get_common_distributions, get_distributions
import matplotlib.transforms as transforms
from matplotlib.table import table
from numpy import arange
from pandas import read_csv
import warnings
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from sklearn.model_selection import train_test_split, cross_val_score
from tabulate import tabulate
from pandas.plotting import lag_plot
import re
#from stats_can import StatsCan as sc
from stats_can import StatsCan
sc = StatsCan()
from scipy.cluster.hierarchy import fcluster
#import pandas_datareader.data as web
In this section, we read the S&P/TSX Composite constituents table from Wikipedia (https://en.wikipedia.org/wiki/S%26P/TSX_Composite_Index). We then retrieve the tickers' adjusted close prices from Yahoo Finance using the yfinance library and clean the data by removing empty rows and columns. For the remaining tickers, we calculate the assets' log returns and remove every asset with a negative expected return. We couple correlation analysis with Principal Component Analysis (PCA) to reduce the number of assets and keep only the most important ones. The correlation analysis is used to identify and remove redundant assets. The end result is a well-diversified portfolio.
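The redundancy filter mentioned above can be sketched in plain Python. This is a minimal illustration and not the project's implementation: the helper names `pearson` and `drop_redundant_assets`, the 0.9 threshold, the greedy keep-first rule, and the toy return series are all assumptions chosen for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def drop_redundant_assets(returns, threshold=0.9):
    """Greedily keep assets whose |correlation| with every kept asset is below threshold.

    `returns` maps ticker -> list of log returns (hypothetical layout).
    """
    kept = []
    for ticker, series in returns.items():
        if all(abs(pearson(series, returns[k])) < threshold for k in kept):
            kept.append(ticker)
    return kept

# Toy data (made up): BBB is AAA scaled by 2, so the two are perfectly correlated.
toy_returns = {
    "AAA": [0.01, -0.02, 0.015, 0.00, -0.01],
    "BBB": [0.02, -0.04, 0.03, 0.00, -0.02],
    "CCC": [-0.01, 0.01, 0.02, -0.03, 0.005],
}
selected = drop_redundant_assets(toy_returns)
print(selected)
```

With a pandas DataFrame of log returns, the same idea would typically start from `DataFrame.corr()`. The greedy rule depends on column order, so a real selection may want a tie-breaking criterion such as expected return.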
#--------------------------------------------- 1. Index Contents Data Collection and Preprocessing ------------------------------------------------
#read the index content from wikipedia and return the index content data frame
def read_index_content(content_html,web_tab_number):
    # Parse every HTML table on the page and keep the one at position web_tab_number
    S_and_P_TSX_Composite = pd.read_html(content_html)[web_tab_number]
    index_content_df = S_and_P_TSX_Composite[['Ticker','Company','Sector [10]','Industry [10]']]
    index_content_df = index_content_df.rename(columns={"Sector [10]": "Sector", "Industry [10]": "Industry"})
    return index_content_df
#return S_and_P_TSX_Composite[['Ticker','Company','Sector [10]','Industry [10]']].head()
#extract the index tickers
def generate_ticker_df(index_content_df):
    # Return the index tickers as a list of strings
    return [str(item) for item in index_content_df['Ticker'].tolist()]
#--------------------------------------------------------------------------------------------------------------------
#Description:Extract adj close price for each stock on the index from Yahoo Finance web site and clean the data
#Input:start date, end date, index ticker list
#Return the index Adj close price data frame
#-----------------------------------------------------------------------------------------------------------------------
def start_date(reporting_year_period = 365*5):
    # The argument is a number of days (default: roughly five years)
    return pd.Timestamp.today() - pd.Timedelta(days = reporting_year_period)
def create_adj_close_price_df(reporting_year_period, content_ticker_list):
    # reporting_year_period is the start timestamp returned by start_date()
    start_date = reporting_year_period
    end_date = date.today()
    # auto_adjust=False keeps the separate 'Adj Close' column in the result
    selected_assets_yahoo_adj_close_price_data = yf.download(content_ticker_list, start=start_date, end=end_date, auto_adjust=False)
    selected_assets_adj_close_price_df = selected_assets_yahoo_adj_close_price_data['Adj Close']
    # Drop every ticker (column) that has at least one missing price
    index_adj_close_price_df = selected_assets_adj_close_price_df.dropna(axis=1)
    return index_adj_close_price_df
def asset_daily_price(price_df,number_of_asset):
    print('\nPlotting the first 5 assets daily adj closed prices\n')
    price_df.iloc[:,:number_of_asset].plot(figsize=(15,6))
    plt.show()
print('\nData collection and preprocessing\n')
index_content_df = read_index_content('https://en.wikipedia.org/wiki/S%26P/TSX_Composite_Index',3)
content_ticker_list = generate_ticker_df(index_content_df)
index_adj_close_price_df = create_adj_close_price_df( start_date(365*5), content_ticker_list )
print('\nList of companies\n')
display(index_content_df)
print('\nAdjusted Close Price Data Frame\n')
display(index_adj_close_price_df)
print('\nData structure\n')
index_adj_close_price_df.info()
print('\nData statics summary\n')
display(index_adj_close_price_df.describe().transpose())
Data collection and preprocessing [*********************100%%**********************] 225 of 225 completed
105 Failed downloads:
['REI.UN', 'FRU', 'EMA', 'WN', 'NWC', 'DML', 'TOU', 'OLA', 'WDO', 'FIL', 'CFP', 'KEL', 'NPI', 'FFH', 'BIR', 'POU', 'AOI', 'INE', 'KNT', 'SRU.UN', 'CU', 'WTE', 'LUN', 'WSP', 'MTY', 'RUS', 'EFN', 'MRU', 'MATR', 'DFY', 'EIF', 'TIH', 'LNR', 'GWO', 'ATD', 'EQB', 'TOY', 'IFP', 'CCL.B', 'DSG', 'CJT', 'BBD.B', 'ARX', 'WPK', 'IMG', 'ABX', 'FVI', 'IFC', 'RCH', 'KXS', 'IVN', 'LUG', 'AAV', 'ALA', 'SIA', 'CPX', 'GEI', 'WCP']: Exception('%ticker%: No price data found, symbol may be delisted (1d 2019-09-18 15:57:34.705069 -> 2024-09-16)')
['CPG', 'ERF', 'ENGH', 'TCN']: Exception('%ticker%: No data found, symbol may be delisted')
['HR.UN', 'IPCO', 'BIP.UN', 'CCA', 'FCR.UN', 'PKI', 'PMZ.UN', 'NWH.UN', 'CSU', 'TECK.B', 'CS', 'EMP.A', 'BEI.UN', 'BEP.UN', 'QBR.B', 'CRT.UN', 'ATRL', 'ATH', 'IIP.UN', 'AP.UN', 'FTT', 'CNR', 'CRR.UN', 'KMP.UN', 'GRT.UN', 'DPM', 'CAR.UN', 'POW', 'TCL.A', 'DIR.UN', 'CHP.UN', 'MTL', 'TSU', 'GIB.A', 'RCI.B', 'TA', 'BDGI', 'BBU.UN', 'CSH.UN', 'ONEX', 'ACO.X', 'CTC.A']: Exception('%ticker%: No timezone found, symbol may be delisted')
['HWX']: Exception("%ticker%: Period 'max' is invalid, must be one of ['1d', '5d']")
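Many of the failures above are consistent with a symbol-format mismatch. Yahoo Finance generally lists Toronto Stock Exchange securities with a `.TO` suffix and writes class or unit designators with a dash (for example `REI-UN.TO`), whereas the Wikipedia table uses bare tickers such as `REI.UN`. A bare symbol that does download may also resolve to an unrelated listing on another exchange. The helper below is a hypothetical sketch of that conversion, not part of the original pipeline, and the convention should be checked for each ticker before it is relied on.

```python
def to_yahoo_tsx_symbol(ticker):
    """Convert a TSX ticker such as 'REI.UN' to the assumed Yahoo Finance form 'REI-UN.TO'."""
    # Dots in class/unit designators become dashes, then the exchange suffix is appended
    return ticker.strip().upper().replace(".", "-") + ".TO"

examples = ["RY", "REI.UN", "BBD.B"]
converted = [to_yahoo_tsx_symbol(t) for t in examples]
print(converted)
# → ['RY.TO', 'REI-UN.TO', 'BBD-B.TO']
```

Applying it to the ticker list before the download, for instance `[to_yahoo_tsx_symbol(t) for t in content_ticker_list]`, would change which columns survive the `dropna` step, so the downstream tables would differ from those shown here.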
List of companies
| | Ticker | Company | Sector | Industry |
|---|---|---|---|---|
| 0 | AAV | Advantage Energy Ltd. | Energy | Oil & Gas Exploration and Production |
| 1 | AOI | Africa Oil Corp. | Energy | Oil & Gas Exploration and Production |
| 2 | AEM | Agnico Eagle Mines Limited | Basic Materials | Metals & Mining |
| 3 | AC | Air Canada | Industrials | Transportation |
| 4 | AGI | Alamos Gold Inc. | Basic Materials | Metals & Mining |
| ... | ... | ... | ... | ... |
| 220 | WTE | Westshore Terminals Investment Corporation | Industrials | Transportation |
| 221 | WPM | Wheaton Precious Metals Corp. | Basic Materials | Metals & Mining |
| 222 | WCP | Whitecap Resources Inc. | Energy | Oil & Gas Exploration and Production |
| 223 | WPK | Winpak Ltd. | Consumer Cyclical | Packaging & Containers |
| 224 | WSP | WSP Global Inc. | Industrials | Construction |
225 rows × 4 columns
Adjusted Close Price Data Frame
| Date | AC | AEM | AGI | AQN | ATS | BB | BCE | BHC | BLDP | BLX | ... | TPZ | TRI | TRP | TVE | TXG | VET | WCN | WFG | WPM | X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-18 | 35.345322 | 50.151981 | 5.939981 | 10.246118 | 13.800000 | 7.52 | 36.354038 | 23.190001 | 5.47 | 14.192307 | ... | 11.939594 | 59.539158 | 37.488026 | 22.429239 | 62.000000 | 14.717662 | 86.776352 | 39.590134 | 25.364801 | 12.059963 |
| 2019-09-19 | 35.645519 | 50.451710 | 6.111328 | 10.291792 | 13.800000 | 7.57 | 36.248428 | 23.150000 | 5.24 | 14.113462 | ... | 11.939594 | 59.698231 | 37.650688 | 22.429239 | 61.119999 | 14.820879 | 87.075195 | 39.590134 | 25.617979 | 10.713511 |
| 2019-09-20 | 34.851452 | 51.306839 | 6.206520 | 10.352691 | 13.710000 | 7.54 | 36.489826 | 22.860001 | 5.37 | 14.070457 | ... | 11.995965 | 58.991207 | 38.227425 | 22.552620 | 60.500000 | 15.001518 | 87.364365 | 39.590134 | 25.692993 | 10.471342 |
| 2019-09-23 | 34.899876 | 52.470490 | 6.330270 | 10.337466 | 13.710000 | 7.51 | 36.429478 | 22.559999 | 5.56 | 14.034618 | ... | 11.982703 | 59.194481 | 38.456642 | 22.420422 | 60.330002 | 15.010121 | 87.711403 | 38.624985 | 26.340006 | 10.694136 |
| 2019-09-24 | 34.599686 | 52.805489 | 6.387385 | 10.512550 | 13.670000 | 5.81 | 36.678429 | 22.020000 | 5.44 | 14.063291 | ... | 11.916386 | 59.530315 | 38.360523 | 22.402796 | 54.299999 | 14.588629 | 88.010254 | 38.803024 | 26.668209 | 10.374474 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 32.720001 | 77.760002 | 18.135992 | 5.280000 | 26.260000 | 2.35 | 36.080002 | 6.200000 | 1.69 | 31.059999 | ... | 18.049999 | 168.600006 | 47.070000 | 22.520000 | 21.820000 | 9.040000 | 185.000000 | 87.669998 | 58.560001 | 32.820000 |
| 2024-09-10 | 32.880001 | 78.910004 | 18.675280 | 5.310000 | 26.340000 | 2.38 | 35.299999 | 6.250000 | 1.72 | 30.760000 | ... | 18.049999 | 171.449997 | 45.790001 | 22.480000 | 21.660000 | 8.940000 | 184.679993 | 87.489998 | 59.410000 | 31.219999 |
| 2024-09-11 | 32.799999 | 79.099998 | 18.885000 | 5.350000 | 26.510000 | 2.45 | 35.189999 | 6.380000 | 1.75 | 29.980000 | ... | 17.940001 | 172.130005 | 45.880001 | 22.520000 | 22.100000 | 9.150000 | 185.389999 | 86.760002 | 59.270000 | 33.389999 |
| 2024-09-12 | 33.000000 | 81.860001 | 20.059999 | 5.390000 | 26.200001 | 2.47 | 35.259998 | 6.300000 | 1.72 | 30.200001 | ... | 18.100000 | 173.759995 | 46.090000 | 22.520000 | 22.549999 | 9.200000 | 185.990005 | 88.260002 | 61.369999 | 34.740002 |
| 2024-09-13 | 33.220001 | 83.169998 | 20.690001 | 5.500000 | 25.709999 | 2.48 | 35.400002 | 6.320000 | 1.80 | 30.740000 | ... | 18.200001 | 172.699997 | 46.549999 | 22.510000 | 22.370001 | 9.220000 | 185.679993 | 90.510002 | 62.560001 | 36.070000 |
1256 rows × 99 columns
Data structure
<class 'pandas.core.frame.DataFrame'>
Index: 1256 entries, 2019-09-18 00:00:00 to 2024-09-13 00:00:00
Data columns (total 99 columns):
| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | AC | 1256 non-null | float64 |
| 1 | AEM | 1256 non-null | float64 |
| 2 | AGI | 1256 non-null | float64 |
| 3 | AQN | 1256 non-null | float64 |
| 4 | ATS | 1256 non-null | float64 |
| 5 | BB | 1256 non-null | float64 |
| 6 | BCE | 1256 non-null | float64 |
| 7 | BHC | 1256 non-null | float64 |
| 8 | BLDP | 1256 non-null | float64 |
| 9 | BLX | 1256 non-null | float64 |
| 10 | BMO | 1256 non-null | float64 |
| 11 | BN | 1256 non-null | float64 |
| 12 | BNS | 1256 non-null | float64 |
| 13 | BTE | 1256 non-null | float64 |
| 14 | BTO | 1256 non-null | float64 |
| 15 | BYD | 1256 non-null | float64 |
| 16 | CAE | 1256 non-null | float64 |
| 17 | CCO | 1256 non-null | float64 |
| 18 | CG | 1256 non-null | float64 |
| 19 | CIGI | 1256 non-null | float64 |
| 20 | CIX | 1256 non-null | float64 |
| 21 | CLS | 1256 non-null | float64 |
| 22 | CM | 1256 non-null | float64 |
| 23 | CNQ | 1256 non-null | float64 |
| 24 | CP | 1256 non-null | float64 |
| 25 | CVE | 1256 non-null | float64 |
| 26 | CWB | 1256 non-null | float64 |
| 27 | DOL | 1256 non-null | float64 |
| 28 | DOO | 1256 non-null | float64 |
| 29 | EFR | 1256 non-null | float64 |
| 30 | ELD | 1256 non-null | float64 |
| 31 | ENB | 1256 non-null | float64 |
| 32 | EQX | 1256 non-null | float64 |
| 33 | ERO | 1256 non-null | float64 |
| 34 | FM | 1256 non-null | float64 |
| 35 | FNV | 1256 non-null | float64 |
| 36 | FR | 1256 non-null | float64 |
| 37 | FSV | 1256 non-null | float64 |
| 38 | FTS | 1256 non-null | float64 |
| 39 | GIL | 1256 non-null | float64 |
| 40 | GOOS | 1256 non-null | float64 |
| 41 | GSY | 1256 non-null | float64 |
| 42 | H | 1256 non-null | float64 |
| 43 | HBM | 1256 non-null | float64 |
| 44 | IAG | 1256 non-null | float64 |
| 45 | IGM | 1256 non-null | float64 |
| 46 | IMO | 1256 non-null | float64 |
| 47 | K | 1256 non-null | float64 |
| 48 | KEY | 1256 non-null | float64 |
| 49 | L | 1256 non-null | float64 |
| 50 | LAAC | 1256 non-null | float64 |
| 51 | MAG | 1256 non-null | float64 |
| 52 | MFC | 1256 non-null | float64 |
| 53 | MG | 1256 non-null | float64 |
| 54 | MX | 1256 non-null | float64 |
| 55 | NAN | 1256 non-null | float64 |
| 56 | NG | 1256 non-null | float64 |
| 57 | NGD | 1256 non-null | float64 |
| 58 | NTR | 1256 non-null | float64 |
| 59 | NXE | 1256 non-null | float64 |
| 60 | OGC | 1256 non-null | float64 |
| 61 | OR | 1256 non-null | float64 |
| 62 | OSK | 1256 non-null | float64 |
| 63 | OTEX | 1256 non-null | float64 |
| 64 | PAAS | 1256 non-null | float64 |
| 65 | PBH | 1256 non-null | float64 |
| 66 | PD | 1256 non-null | float64 |
| 67 | PEY | 1256 non-null | float64 |
| 68 | PPL | 1256 non-null | float64 |
| 69 | PRMW | 1256 non-null | float64 |
| 70 | PSI | 1256 non-null | float64 |
| 71 | PSK | 1256 non-null | float64 |
| 72 | QSR | 1256 non-null | float64 |
| 73 | RY | 1256 non-null | float64 |
| 74 | SAP | 1256 non-null | float64 |
| 75 | SHOP | 1256 non-null | float64 |
| 76 | SII | 1256 non-null | float64 |
| 77 | SIL | 1256 non-null | float64 |
| 78 | SJ | 1256 non-null | float64 |
| 79 | SLF | 1256 non-null | float64 |
| 80 | SPB | 1256 non-null | float64 |
| 81 | SSL | 1256 non-null | float64 |
| 82 | SSRM | 1256 non-null | float64 |
| 83 | STN | 1256 non-null | float64 |
| 84 | SU | 1256 non-null | float64 |
| 85 | T | 1256 non-null | float64 |
| 86 | TD | 1256 non-null | float64 |
| 87 | TFII | 1256 non-null | float64 |
| 88 | TLRY | 1256 non-null | float64 |
| 89 | TPZ | 1256 non-null | float64 |
| 90 | TRI | 1256 non-null | float64 |
| 91 | TRP | 1256 non-null | float64 |
| 92 | TVE | 1256 non-null | float64 |
| 93 | TXG | 1256 non-null | float64 |
| 94 | VET | 1256 non-null | float64 |
| 95 | WCN | 1256 non-null | float64 |
| 96 | WFG | 1256 non-null | float64 |
| 97 | WPM | 1256 non-null | float64 |
| 98 | X | 1256 non-null | float64 |
dtypes: float64(99)
memory usage: 981.2+ KB
Data statics summary
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| AC | 1256.0 | 36.501842 | 3.103101 | 25.417490 | 34.364492 | 36.198771 | 38.322836 | 61.728199 |
| AEM | 1256.0 | 53.826276 | 9.554856 | 32.276482 | 47.322629 | 52.197571 | 58.678662 | 83.169998 |
| AGI | 1256.0 | 9.527597 | 3.251711 | 3.709501 | 7.324404 | 8.330248 | 11.803682 | 20.690001 |
| AQN | 1256.0 | 9.877426 | 2.822586 | 4.757608 | 6.719469 | 10.848601 | 12.409449 | 14.397230 |
| ATS | 1256.0 | 28.954682 | 10.538851 | 10.000000 | 17.435000 | 31.440001 | 38.165001 | 48.730000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| VET | 1256.0 | 11.502070 | 5.511316 | 1.587390 | 7.007686 | 11.738775 | 14.317163 | 28.071140 |
| WCN | 1256.0 | 124.517144 | 25.728199 | 69.163422 | 99.587046 | 127.753422 | 137.964344 | 186.500000 |
| WFG | 1256.0 | 67.813160 | 18.790793 | 14.714417 | 59.307963 | 73.519878 | 80.926613 | 98.821205 |
| WPM | 1256.0 | 41.093440 | 8.054813 | 22.332947 | 37.355909 | 41.533010 | 45.807121 | 62.560001 |
| X | 1256.0 | 23.344395 | 10.808583 | 4.768804 | 16.141821 | 23.383692 | 29.996864 | 49.411919 |
99 rows × 8 columns
def plot_assets_distribution(df,xlabel, ylabel, title=''):
    # Define the number of assets
    n_assets = df.shape[1]
    # Create one subplot per asset
    fig, axes = plt.subplots(1, n_assets, figsize=(23, 3))
    if n_assets == 1:
        axes = [axes]
    # Iterate over each asset
    for i, asset in enumerate(df.columns):
        sns.histplot(df[asset], kde=True, ax=axes[i])
        axes[i].set_title(f'{title + asset}')
        axes[i].set_xlabel(xlabel)
        axes[i].set_ylabel(ylabel)
        # Calculate summary statistics
        mean_return = df[asset].mean()
        std_dev = df[asset].std()
        skewness = df[asset].skew()
        kurtosis = df[asset].kurtosis()
        # Format the statistics as a text box
        statistics = (f"Mean: {mean_return:.4f}\n"
                      f"Std Dev: {std_dev:.4f}\n"
                      f"Skewness: {skewness:.4f}\n"
                      f"Kurtosis: {kurtosis:.4f}")
        # Place the text under the plot
        axes[i].text(0.3, -0.3, statistics, transform=axes[i].transAxes,
                     fontsize=10, verticalalignment='top', bbox=dict(boxstyle="round,pad=0.3", edgecolor="black", facecolor="lightgrey"))
def normalize_asset_daily_price(price_df):
    # Rebase every price series so that its first observation equals 100
    return (price_df / price_df.iloc[0])*100
def plot_normalize_asset_daily_price(p_normalized_asset_daily_price_df,number_of_asset):
    i_normalized_asset_daily_price_df = p_normalized_asset_daily_price_df.iloc[:,:number_of_asset]
    i_normalized_asset_daily_price_df.plot(figsize = (15, 6))
    #plt.show()
    plot_assets_distribution(i_normalized_asset_daily_price_df, 'Adjusted Close Price','Frequency')
def calculate_stock_price_log_return(index_adj_close_price_df):
    # Daily log return: ln(P_t / P_{t-1})
    log_returns = np.log(index_adj_close_price_df / index_adj_close_price_df.shift(1))
    log_returns = log_returns.dropna(how = 'all')
    return log_returns
#keep only the assets whose mean log return exceeds the threshold
def removing_assets_with_negative_expected_return(log_returns,threshold):
    # Mean log return per asset, computed once
    mean_log_returns = log_returns.mean()
    # Collect the assets whose expected return exceeds the threshold (0 keeps strictly positive ones)
    assets_with_positive_expected_return = [asset for asset in log_returns.columns
                                            if mean_log_returns[asset] > threshold]
    # Drop duplicate tickers while preserving column order
    assets_with_positive_expected_return_list = list(dict.fromkeys(assets_with_positive_expected_return))
    return assets_with_positive_expected_return_list
def positive_assets_log_returns_df(log_returns_df, positive_assets_list):
    return log_returns_df[positive_assets_list]
def stocks_initial_price(positive_assets_list):
    # Relies on the module-level index_adj_close_price_df built during data collection
    return index_adj_close_price_df.iloc[0][positive_assets_list]
def generate_asset_volatility(frequency_date_column, log_return_df):
    # First letter of the column name gives the pandas period frequency ('Quarter' -> 'Q')
    frequency = frequency_date_column[0].upper()
    # 252-day rolling standard deviation, annualized with sqrt(252)
    assets_volatility_df = log_return_df.rolling(center=False,window= 252).std() * np.sqrt(252)
    assets_volatility_df = assets_volatility_df.add_suffix(' Volatility')
    assets_volatility_df = assets_volatility_df.dropna(axis=0)
    assets_volatility_df[frequency_date_column] = pd.to_datetime(assets_volatility_df.index, format = '%m/%Y')
    assets_volatility_df[frequency_date_column] = assets_volatility_df[frequency_date_column].dt.to_period(frequency)
    assets_volatility_df.set_index(frequency_date_column, inplace=True)
    # Average the rolling volatility within each period
    assets_volatilities = assets_volatility_df.groupby(frequency_date_column).mean()
    #assets_volatilities = round(assets_volatilities,4)
    assets_volatilities = assets_volatilities.dropna(axis=0)
    return assets_volatilities
def plotting_assets_log_returns(df,xlabel, ylabel, title=''):
    # Define the number of assets
    n_assets = df.shape[1]
    # Create one subplot per asset
    fig, axes = plt.subplots(1, n_assets, figsize=(20, 3))
    if n_assets == 1:
        axes = [axes]
    for i, column in enumerate(df.columns):
        axes[i].plot(df[column], label=column)
        axes[i].set_title(f'{column}')
        axes[i].set_xlabel(xlabel)
        axes[i].set_ylabel(ylabel)
    # Set common labels
    plt.xlabel(xlabel, fontsize=12)
    #plt.tight_layout()
    #plt.show()
def plotting_assets_volatility(df,xlabel, ylabel, title=''):
    # Define the number of assets
    n_assets = df.shape[1]
    # Create one subplot per asset
    fig, axes = plt.subplots(1, n_assets, figsize=(30, 3))
    if n_assets == 1:
        axes = [axes]
    # Convert the period index to timestamps so that it can be plotted
    df.index= df.index.to_timestamp()
    #df.index = date.dt.strftime('%Y')
    for i, ticker in enumerate(df.columns):
        axes[i].plot(df.index, df[ticker], label=ticker)
        axes[i].set_title(f'{ticker}')
        axes[i].set_xlabel(xlabel)
        axes[i].set_ylabel(ylabel)
    plt.xlabel(xlabel, fontsize=8)
    plt.tight_layout(pad=0.05, w_pad=0.01, h_pad=1.0)
    #plt.show()
def portfolio_arihtmetics(log_returns,stocks_initial_prices):
    # Per-asset summary: mean, variance, volatility, return-to-volatility ratio and first observed price
    return pd.DataFrame({'mu expected_return':log_returns.mean(),
                         'variance':log_returns.var(),
                         'Sigmas(volatilities)':log_returns.std(),
                         'modifiy shape(Er)/𝝈':log_returns.mean()/log_returns.std(),
                         'initial price': stocks_initial_prices}).transpose()
number_of_asset =5
normalized_asset_daily_price_df = normalize_asset_daily_price(index_adj_close_price_df)
stock_price_log_return = calculate_stock_price_log_return(normalized_asset_daily_price_df)
log_returns = positive_assets_log_returns_df(stock_price_log_return,
removing_assets_with_negative_expected_return(stock_price_log_return,0))
asset_volatility_df = generate_asset_volatility('Quarter', log_returns)
positive_assets_list = removing_assets_with_negative_expected_return(stock_price_log_return,0)
stocks_initial_prices = stocks_initial_price(positive_assets_list)
portfolio_arihtmetics_df = portfolio_arihtmetics(log_returns,stocks_initial_prices).transpose()
print('\nExploratory Data Analysis (EDA)\n')
#index_adj_close_price_df.iloc[0] # first row
asset_daily_price(index_adj_close_price_df,number_of_asset) #Plotting the first 5 assets daily adj closed prices
plot_normalize_asset_daily_price(normalized_asset_daily_price_df,number_of_asset) #Normalization of adj closed prices to 100
Exploratory Data Analysis (EDA) Plotting the first 5 assets daily adj closed prices
The graphs show that the distribution of closing stock prices has the following characteristics:
-- Non-stationarity: Stock prices tend to trend over time, so their mean and variance depend on the observation window. A stationary series, by contrast, has a distribution that does not change over time.
-- Fat tails (leptokurtosis): Stock price distributions often have more extreme values than a normal distribution, so large price changes occur more often than a normal distribution would predict.
-- Skewness: Stock price distributions are often asymmetric, with a longer tail on one side, which reflects an imbalance between upward and downward price movements.
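The skewness and fat-tail diagnostics listed above can be illustrated with a small sketch. It uses population-moment formulas, which differ slightly from the bias-corrected estimators behind pandas' `skew()` and `kurtosis()`. The two simulated samples and the helper names `skewness` and `excess_kurtosis` are assumptions made for the example, not the project's data.

```python
import random
import statistics

def skewness(xs):
    """Population skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Population excess kurtosis: 0 for a normal distribution, > 0 for fat tails."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0

rng = random.Random(42)
# Benchmark: normally distributed daily changes
normal_sample = [rng.gauss(0.0, 0.01) for _ in range(20000)]
# Fat-tailed sample: mostly calm days, with about 5% high-volatility days
fat_tailed_sample = [rng.gauss(0.0, 0.05 if rng.random() < 0.05 else 0.01)
                     for _ in range(20000)]

print("normal     :", skewness(normal_sample), excess_kurtosis(normal_sample))
print("fat-tailed :", skewness(fat_tailed_sample), excess_kurtosis(fat_tailed_sample))
```

The mixture sample should show an excess kurtosis well above zero while the normal sample stays close to zero, which is the pattern the histograms' statistics boxes are meant to reveal.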
This leads us to work with the logarithmic returns of the assets.
In this section, we calculate the assets' log returns instead of arithmetic returns. The arithmetic return is the percentage change in an asset's price from one period to the next, whereas the log return over a period is the natural logarithm of the ratio of the ending price to the starting price.
$$ \text{Arithmetic return: } R_t = \frac{P_t - P_{t-1}}{P_{t-1}} $$
$$ \text{Log return: } r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) $$
Throughout this project, we use log returns instead of arithmetic returns because the upcoming sections rely on stochastic simulation of stock prices to calculate Profit & Loss, VaR, CVaR and stress-testing results. Log returns are commonly used in the financial literature to model asset prices over time: prices cannot be negative but can increase indefinitely, and modeling log returns as normally distributed implies lognormal prices that respect this constraint. In practice, log returns are only approximately normal and exhibit fat tails, so extreme returns occur more often than a normal model predicts. Stocks trade at very high frequency over very short periods, and the form of their price distributions is unknown, as the plots above show. Log returns naturally account for continuous compounding, whereas arithmetic returns correspond to simple, per-period compounding. Furthermore, unlike arithmetic returns, log returns are additive over time: the log returns of consecutive periods sum to the total log return.
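The additivity property can be checked on a short price path. The prices below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical price path: +10%, -10%, +10%
prices = [100.0, 110.0, 99.0, 108.9]

# Per-period arithmetic and log returns
arithmetic = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
log_rets = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Returns measured directly from the first to the last price
total_arithmetic_return = prices[-1] / prices[0] - 1
total_log_return = math.log(prices[-1] / prices[0])

print("sum of log returns       :", sum(log_rets), "vs total:", total_log_return)
print("sum of arithmetic returns:", sum(arithmetic), "vs total:", total_arithmetic_return)
```

The log returns sum to the total log return, while the arithmetic returns sum to about 0.10 even though the price rose by only about 8.9%. Exponentiating the summed log return and subtracting one recovers the total arithmetic return.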
print('\nAssets log return data frame\n')
display(log_returns)
print('\nAssets volatility data frame\n')
display(asset_volatility_df)
print('\nPortfolio arithmetics\n')
display(portfolio_arihtmetics_df)
#print('\nAssets log returns Distribution\n')
plot_assets_distribution(log_returns.iloc[:,:number_of_asset], 'log_returns','Frequency')
#print('\nAssets Volatility Distribution\n')
plot_assets_distribution(asset_volatility_df.iloc[:,:number_of_asset], 'Volatility','Frequency')
plotting_assets_log_returns(log_returns.iloc[:,:number_of_asset], 'Date','log_returns')
plotting_assets_volatility(asset_volatility_df.iloc[:,:number_of_asset], 'Date','Volatility')
Assets log return data frame
| Date | AEM | AGI | ATS | BLX | BMO | BN | BNS | BTE | BTO | BYD | ... | TD | TFII | TPZ | TRI | TRP | TVE | WCN | WFG | WPM | X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.005959 | 0.028438 | 0.000000 | -0.005571 | 0.001910 | 0.010767 | 0.000534 | 0.018018 | -0.006152 | -0.014676 | ... | 0.005576 | 0.000000 | 0.000000 | 0.002668 | 0.004330 | 0.000000 | 0.003438 | 0.000000 | 0.009932 | -0.118386 |
| 2019-09-20 | 0.016807 | 0.015456 | -0.006543 | -0.003052 | 0.001362 | -0.004812 | 0.000889 | 0.046520 | -0.001235 | -0.019909 | ... | 0.002256 | 0.000000 | 0.004710 | -0.011914 | 0.015202 | 0.005486 | 0.003315 | 0.000000 | 0.002924 | -0.022863 |
| 2019-09-23 | 0.022427 | 0.019743 | 0.000000 | -0.002550 | -0.005186 | -0.014012 | 0.003372 | -0.005698 | -0.000927 | -0.008154 | ... | 0.000000 | 0.002291 | -0.001106 | 0.003440 | 0.005978 | -0.005879 | 0.003964 | -0.024681 | 0.024871 | 0.021053 |
| 2019-09-24 | 0.006364 | 0.008982 | -0.002922 | 0.002041 | -0.007554 | -0.010592 | 0.006709 | -0.064920 | -0.013699 | -0.022473 | ... | -0.003996 | 0.000000 | -0.005550 | 0.005657 | -0.002503 | -0.000786 | 0.003401 | 0.004599 | 0.012383 | -0.030347 |
| 2019-09-25 | -0.024847 | -0.053571 | -0.006606 | 0.012662 | 0.009195 | 0.008897 | 0.003864 | 0.018127 | 0.014626 | -0.008811 | ... | -0.002265 | 0.000000 | -0.000556 | 0.005183 | 0.000000 | -0.003942 | -0.003292 | 0.000000 | -0.036159 | 0.066812 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.011251 | 0.003309 | 0.032904 | 0.002256 | 0.008346 | 0.022295 | 0.015522 | -0.006536 | 0.004310 | 0.010163 | ... | 0.018057 | 0.002572 | 0.004999 | 0.014097 | 0.008106 | 0.003559 | 0.012948 | 0.004000 | 0.009954 | 0.048379 |
| 2024-09-10 | 0.014681 | 0.029302 | 0.003042 | -0.009706 | -0.001567 | 0.001483 | 0.003115 | -0.029952 | -0.006163 | -0.016714 | ... | -0.006529 | -0.013214 | 0.000000 | 0.016763 | -0.027570 | -0.001778 | -0.001731 | -0.002055 | 0.014411 | -0.049979 |
| 2024-09-11 | 0.002405 | 0.011167 | 0.006433 | -0.025685 | 0.017460 | 0.017833 | 0.006007 | 0.013423 | -0.001856 | -0.007953 | ... | 0.010587 | 0.030752 | -0.006113 | 0.003958 | 0.001964 | 0.001778 | 0.003837 | -0.008379 | -0.002359 | 0.067198 |
| 2024-09-12 | 0.034298 | 0.060360 | -0.011763 | 0.007311 | 0.009205 | 0.019594 | -0.000580 | 0.019803 | 0.007713 | 0.019516 | ... | 0.002751 | 0.001961 | 0.008879 | 0.009425 | 0.004567 | 0.000000 | 0.003231 | 0.017141 | 0.034818 | 0.039635 |
| 2024-09-13 | 0.015876 | 0.030923 | -0.018879 | 0.017723 | 0.005038 | 0.008339 | 0.005975 | -0.006557 | 0.005940 | 0.023286 | ... | 0.004996 | 0.000350 | 0.005510 | -0.006119 | 0.009931 | -0.000444 | -0.001668 | 0.025173 | 0.019205 | 0.037570 |
1255 rows × 79 columns
Assets volatility data frame
| Quarter | AEM Volatility | AGI Volatility | ATS Volatility | BLX Volatility | BMO Volatility | BN Volatility | BNS Volatility | BTE Volatility | BTO Volatility | BYD Volatility | ... | TD Volatility | TFII Volatility | TPZ Volatility | TRI Volatility | TRP Volatility | TVE Volatility | WCN Volatility | WFG Volatility | WPM Volatility | X Volatility |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020Q3 | 0.504936 | 0.722582 | 0.400987 | 0.657471 | 0.495752 | 0.512884 | 0.446284 | 1.090984 | 0.781376 | 0.931228 | ... | 0.437350 | 0.490838 | 0.768609 | 0.306863 | 0.488613 | 0.108714 | 0.334995 | 0.692859 | 0.460984 | 0.751607 |
| 2020Q4 | 0.517322 | 0.735122 | 0.415200 | 0.669080 | 0.503352 | 0.532221 | 0.454041 | 1.117460 | 0.794128 | 0.938931 | ... | 0.442245 | 0.497988 | 0.774649 | 0.310427 | 0.501494 | 0.119606 | 0.334636 | 0.702167 | 0.479239 | 0.758829 |
| 2021Q1 | 0.507920 | 0.710529 | 0.436002 | 0.645300 | 0.475903 | 0.523339 | 0.433218 | 1.103703 | 0.761213 | 0.888265 | ... | 0.419510 | 0.540971 | 0.704425 | 0.304597 | 0.476976 | 0.115931 | 0.318027 | 0.680943 | 0.501728 | 0.815657 |
| 2021Q2 | 0.394552 | 0.498487 | 0.390192 | 0.364704 | 0.264166 | 0.330288 | 0.238358 | 0.833577 | 0.422179 | 0.507479 | ... | 0.231210 | 0.415803 | 0.250732 | 0.210103 | 0.268063 | 0.105479 | 0.188678 | 0.478729 | 0.437271 | 0.789773 |
| 2021Q3 | 0.368136 | 0.462163 | 0.378344 | 0.279486 | 0.202586 | 0.292768 | 0.183985 | 0.719885 | 0.339572 | 0.426810 | ... | 0.179408 | 0.412925 | 0.200002 | 0.198581 | 0.235408 | 0.111091 | 0.147088 | 0.403321 | 0.393466 | 0.728007 |
| 2021Q4 | 0.333092 | 0.418524 | 0.360240 | 0.238818 | 0.185123 | 0.262962 | 0.171322 | 0.673490 | 0.311613 | 0.411541 | ... | 0.167922 | 0.416557 | 0.186838 | 0.194944 | 0.209064 | 0.109803 | 0.145107 | 0.368686 | 0.351394 | 0.692851 |
| 2022Q1 | 0.327136 | 0.390095 | 0.346108 | 0.217548 | 0.186195 | 0.250669 | 0.166026 | 0.610134 | 0.335228 | 0.405959 | ... | 0.172636 | 0.357399 | 0.176881 | 0.187809 | 0.184952 | 0.112337 | 0.158590 | 0.352844 | 0.305565 | 0.606421 |
| 2022Q2 | 0.351677 | 0.394403 | 0.385216 | 0.209204 | 0.202487 | 0.282420 | 0.183523 | 0.602856 | 0.386554 | 0.394990 | ... | 0.197151 | 0.393487 | 0.196335 | 0.188245 | 0.198772 | 0.119470 | 0.178486 | 0.368329 | 0.297601 | 0.545716 |
| 2022Q3 | 0.387415 | 0.430022 | 0.432768 | 0.226257 | 0.224788 | 0.310316 | 0.205239 | 0.653420 | 0.404918 | 0.409088 | ... | 0.218834 | 0.423243 | 0.219101 | 0.197006 | 0.227495 | 0.121693 | 0.203167 | 0.421957 | 0.322888 | 0.542057 |
| 2022Q4 | 0.431849 | 0.460790 | 0.459860 | 0.256554 | 0.253014 | 0.344514 | 0.235036 | 0.668572 | 0.412929 | 0.408188 | ... | 0.242017 | 0.446166 | 0.233564 | 0.210808 | 0.269814 | 0.120017 | 0.226986 | 0.450482 | 0.363219 | 0.558721 |
| 2023Q1 | 0.427578 | 0.447831 | 0.460281 | 0.267589 | 0.255157 | 0.357661 | 0.245553 | 0.654676 | 0.383014 | 0.383679 | ... | 0.241562 | 0.433118 | 0.224136 | 0.212842 | 0.291662 | 0.134372 | 0.230723 | 0.454679 | 0.370638 | 0.548080 |
| 2023Q2 | 0.403774 | 0.420084 | 0.415433 | 0.289826 | 0.250314 | 0.348234 | 0.242711 | 0.624774 | 0.380361 | 0.338813 | ... | 0.235099 | 0.389740 | 0.205327 | 0.206625 | 0.291595 | 0.133922 | 0.217230 | 0.416975 | 0.363502 | 0.517238 |
| 2023Q3 | 0.368457 | 0.369703 | 0.360865 | 0.282366 | 0.234860 | 0.336513 | 0.230422 | 0.525978 | 0.362044 | 0.282926 | ... | 0.222803 | 0.363645 | 0.174805 | 0.213448 | 0.287583 | 0.121043 | 0.198279 | 0.345792 | 0.342282 | 0.523164 |
| 2023Q4 | 0.318009 | 0.329529 | 0.331048 | 0.265068 | 0.208913 | 0.325726 | 0.217096 | 0.471112 | 0.353394 | 0.267371 | ... | 0.203985 | 0.332438 | 0.144614 | 0.200450 | 0.252149 | 0.112219 | 0.182783 | 0.301782 | 0.304097 | 0.501732 |
| 2024Q1 | 0.301018 | 0.325923 | 0.323614 | 0.270098 | 0.203953 | 0.309186 | 0.209275 | 0.447827 | 0.351019 | 0.262426 | ... | 0.200144 | 0.324681 | 0.126838 | 0.194390 | 0.220564 | 0.098177 | 0.176920 | 0.300753 | 0.290663 | 0.502309 |
| 2024Q2 | 0.294487 | 0.334375 | 0.331103 | 0.256207 | 0.202602 | 0.290805 | 0.199596 | 0.422513 | 0.285629 | 0.286738 | ... | 0.192505 | 0.303650 | 0.120100 | 0.196177 | 0.203598 | 0.085080 | 0.173198 | 0.295778 | 0.294560 | 0.487154 |
| 2024Q3 | 0.300283 | 0.336350 | 0.331269 | 0.264102 | 0.216194 | 0.287105 | 0.195979 | 0.420838 | 0.257450 | 0.309874 | ... | 0.187867 | 0.282098 | 0.133813 | 0.181548 | 0.182768 | 0.083004 | 0.171691 | 0.296492 | 0.296150 | 0.438869 |
17 rows × 79 columns
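The quarterly values above are consistent with an annualized rolling volatility averaged within each quarter: the first quarter shown, 2020Q3, falls roughly 252 trading days after the first return date. The following is a minimal sketch of that aggregation, assuming a 252-day window and synthetic returns. The name `toy_returns` and the tickers `AAA` and `BBB` are hypothetical and not part of the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns for two hypothetical tickers (not the S&P/TSX data).
rng = np.random.default_rng(42)
dates = pd.bdate_range("2019-09-19", periods=600)
toy_returns = pd.DataFrame(
    rng.normal(0.0, 0.02, size=(len(dates), 2)),
    index=dates,
    columns=["AAA", "BBB"],
)

# Annualized volatility over a 252-day rolling window (assumed window length).
rolling_vol = toy_returns.rolling(window=252).std() * np.sqrt(252)
rolling_vol = rolling_vol.dropna().add_suffix(" Volatility")

# Average the daily rolling values within each calendar quarter.
quarterly_vol = rolling_vol.groupby(rolling_vol.index.to_period("Q")).mean()
quarterly_vol.index.name = "Quarter"
print(quarterly_vol.round(6))
```

Because the first 251 rows have no complete window, the first quarter in the result is the one containing the 252nd observation.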
Portfolio arithmetics
| | mu expected_return | variance | Sigmas(volatilities) | modifiy shape(Er)/𝝈 | initial price |
|---|---|---|---|---|---|
| AEM | 0.000403 | 0.000602 | 0.024527 | 0.016433 | 50.151981 |
| AGI | 0.000994 | 0.000922 | 0.030372 | 0.032740 | 5.939981 |
| ATS | 0.000496 | 0.000570 | 0.023885 | 0.020757 | 13.800000 |
| BLX | 0.000616 | 0.000562 | 0.023698 | 0.025986 | 14.192307 |
| BMO | 0.000303 | 0.000350 | 0.018704 | 0.016185 | 58.515537 |
| ... | ... | ... | ... | ... | ... |
| TVE | 0.000003 | 0.000047 | 0.006884 | 0.000416 | 22.429239 |
| WCN | 0.000606 | 0.000192 | 0.013850 | 0.043764 | 86.776352 |
| WFG | 0.000659 | 0.000805 | 0.028367 | 0.023226 | 39.590134 |
| WPM | 0.000719 | 0.000529 | 0.023003 | 0.031271 | 25.364801 |
| X | 0.000873 | 0.001493 | 0.038644 | 0.022590 | 12.059963 |
79 rows × 5 columns
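The columns of the table above are related in a simple way: the variance is the square of the volatility, and the ratio column equals the expected return divided by the volatility. No risk-free rate is subtracted, which is presumably why it is labelled a modified Sharpe ratio. A small check on three rows copied from the table follows; the names `rows` and `computed` are illustrative only.

```python
# (mu, sigma, reported ratio) copied from three rows of the table above.
rows = {
    "AEM": (0.000403, 0.024527, 0.016433),
    "AGI": (0.000994, 0.030372, 0.032740),
    "WCN": (0.000606, 0.013850, 0.043764),
}

computed = {}
for ticker, (mu, sigma, reported) in rows.items():
    ratio = mu / sigma       # expected return per unit of volatility
    variance = sigma ** 2    # variance is the squared volatility
    computed[ticker] = ratio
    print(f"{ticker}: ratio={ratio:.4f} (reported {reported:.4f}), variance={variance:.6f}")
```

The small differences in the last digits come from the rounding of the displayed inputs.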
In this section, we stack Principal Component Analysis (PCA), correlation analysis, and hierarchical clustering to build a diversified portfolio that contains only the most important, weakly correlated assets. PCA is a dimensionality-reduction technique, used here to reduce the number of assets. It takes the standardized log returns of the assets as input and transforms the original set of assets into uncorrelated variables called principal components, ordered so that the leading components capture most of the variance in the data. Assets with a high loading on at least one retained component are kept as the most important assets, and their correlation matrix is computed. The correlation analysis then examines the correlations among these assets and drops highly correlated assets that may be redundant. Finally, hierarchical clustering groups the remaining assets and keeps one representative asset per cluster. The remaining assets are expected to form a well-diversified portfolio.
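The three stages can be illustrated on a toy example before reading the full implementation. This is a sketch under stated assumptions, not the notebook's pipeline: the returns are synthetic, the asset names are hypothetical, the loadings come from an eigendecomposition of the correlation matrix (which matches PCA on standardized returns up to sign) instead of scikit-learn, `linkage` is called directly instead of going through a seaborn clustermap, and the cut-offs (two components, loading threshold 0.4, two clusters) are chosen only for this toy data.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic log returns: two correlated pairs plus one independent asset.
rng = np.random.default_rng(0)
n_days = 500
factor_a = rng.normal(0.0, 0.01, n_days)
factor_b = rng.normal(0.0, 0.01, n_days)
toy_returns = pd.DataFrame({
    "A1": factor_a + rng.normal(0.0, 0.002, n_days),
    "A2": factor_a + rng.normal(0.0, 0.002, n_days),
    "B1": factor_b + rng.normal(0.0, 0.005, n_days),
    "B2": factor_b + rng.normal(0.0, 0.005, n_days),
    "C1": rng.normal(0.0, 0.01, n_days),
})

# Stage 1 (PCA): eigenvectors of the correlation matrix are the PCA loadings
# of the standardized returns, up to sign. Keep the two leading components.
eigenvalues, eigenvectors = np.linalg.eigh(toy_returns.corr().to_numpy())
order = np.argsort(eigenvalues)[::-1]
loadings = pd.DataFrame(eigenvectors[:, order[:2]],
                        index=toy_returns.columns, columns=["PC1", "PC2"])
important = loadings[(loadings.abs() > 0.4).any(axis=1)].index.tolist()

# Stage 2 (correlation analysis): keep assets that have at least one
# correlation below 0.3 with another retained asset.
corr = toy_returns[important].corr()
kept = corr[(corr < 0.3).any(axis=1)].index.tolist()
corr_kept = corr.loc[kept, kept]

# Stage 3 (hierarchical clustering): cluster the correlation profiles and keep
# the member closest to the mean return series of each cluster.
labels = fcluster(linkage(corr_kept.to_numpy(), method="ward"), t=2, criterion="maxclust")
representatives = []
for label in np.unique(labels):
    members = [asset for asset, lab in zip(kept, labels) if lab == label]
    centroid = toy_returns[members].mean(axis=1)
    distances = toy_returns[members].sub(centroid, axis=0).pow(2).sum()
    representatives.append(distances.idxmin())

print("important:", important)
print("representatives:", representatives)
```

In this toy setting the independent asset has small loadings on the two leading components and is filtered out in stage 1, and each correlated pair collapses to a single representative in stage 3.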
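# Imports used in this section (repeated here so the cell can run on its own)
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy.cluster.hierarchy import fcluster
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from IPython.display import display

def generate_correlation_matrix(log_returns):
    # Pairwise Pearson correlations between asset log returns
    return log_returns.corr(method='pearson')

def get_selected_assets_volatility(assets_volatility_df, selected_content_ticker_list):
    # Strip the ' Volatility' suffix so the column names match the ticker list
    renamed_df = assets_volatility_df.rename(columns=lambda col: col.replace(' Volatility', ''))
    return renamed_df[selected_content_ticker_list]

#-------------------------------------------------------------------------------
#Principal Components Analysis(PCA) to select the most important assets
#-------------------------------------------------------------------------------
def selecting_important_item_PCA_treshold_method(matrix,threshold):
    # Keep the rows (assets) with at least one absolute loading above the threshold
    return matrix[(matrix.abs() > threshold).any(axis=1)].index.to_list()

def selecting_important_item_corr_treshold_method(matrix,threshold):
    # Keep the rows (assets) with at least one correlation below the threshold
    # TODO (author's note): the comparison is to be changed to >=
    return matrix[(matrix < threshold).any(axis=1)].index.to_list()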
def setting_PCA_for_assets_selection(log_returns_df):
    # Standardize the log returns so every asset has zero mean and unit variance
    scaler = StandardScaler()
    scaled_data_df = scaler.fit_transform(log_returns_df)
    # Fit PCA with all components; the number to keep is chosen later from the explained variance
    all_pca = PCA(n_components=None)
    all_pca.fit(scaled_data_df)
    # Share of the total variance explained by each component
    explained_variance = all_pca.explained_variance_ratio_
    # Principal component loadings (coefficients)
    loadings_matrix = all_pca.components_
    # Create a DataFrame for loadings: one row per asset, one column per component
    loadings_matrix_df = pd.DataFrame(loadings_matrix.T, columns=[f'PC{i+1}' for i in range(loadings_matrix.shape[0])],
                                      index=log_returns_df.columns)  # use the argument, not the global log_returns
    return loadings_matrix_df, explained_variance
#----------------------
def get_num_components(explained_variance,cumulative_variance_treshold = 0.9):
    # Smallest number of components whose cumulative explained variance reaches the threshold,
    # capped at the number of available components
    cumulative_variance = explained_variance.cumsum()
    num_components = int((cumulative_variance < cumulative_variance_treshold).sum()) + 1
    return min(num_components, len(explained_variance))

def select_top_components_df(loadings_matrix_df, num_components, threshold_for_high_loadings = 0.5):
    # Select top components (threshold_for_high_loadings is not used here)
    return loadings_matrix_df.iloc[:, :num_components]

def select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_high_loadings = 0.5):
    # Select top components
    selected_components_df = loadings_matrix_df.iloc[:, :num_components]
    # Keep the assets with a high absolute loading on at least one selected component
    return selected_components_df[(selected_components_df.abs() > threshold_for_high_loadings).any(axis=1)]
def plot_explained_variance_for_assets_selection(loadings_matrix_df, explained_variance):
    # Print explained variance
    explained_variance_df = pd.DataFrame(explained_variance).T
    explained_variance_df.columns = loadings_matrix_df.columns
    print('\nexplained_variance_df\n')
    display(explained_variance_df)
    # Plotting the explained variance
    plt.figure(figsize=(10, 6))
    plt.bar(range(1, len(explained_variance) + 1), explained_variance, alpha=0.5, align='center', label='individual explained variance')
    plt.step(range(1, len(explained_variance) + 1), np.cumsum(explained_variance), where='mid', label='cumulative explained variance')
    plt.xlabel('Principal Components')
    plt.ylabel('Explained Variance Ratio')
    plt.title('Explained Variance by Principal Components')
    plt.legend(loc='best')
    #plt.show()

#----------------------
def print_explained_variance(loadings_matrix_df, explained_variance,cumulative_variance_treshold, num_components, threshold_for_highest_loadings):
    # Print explained variance
    print('\nloadings_matrix_df\n')
    display(loadings_matrix_df)
    num_components = get_num_components(explained_variance,cumulative_variance_treshold)
    top_components_df = select_top_components_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    print('\ntop_components_df\n')
    display(top_components_df)
    print('\nMost important assets with top components\n')
    top_indicators_df = select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    display(top_indicators_df)

#def get_all_assets_corr_matrix(log_returns_df, cumulative_variance_treshold = 1, threshold_for_highest_loadings = 0.5 ):
#    all_assets_matrix = generate_correlation_matrix(log_returns_df)
#    return all_assets_matrix

def get_most_important_assets_log_returns_df_PCA_method(log_returns_df, most_important_assets_log_returns_list_PCA):
    return log_returns_df[most_important_assets_log_returns_list_PCA]

def get_most_important_assets_corr_matrix_PCA_method(most_important_assets_log_returns_df_PCA_method):
    #PCA to select most important portfolio assets
    return generate_correlation_matrix(most_important_assets_log_returns_df_PCA_method)
#----------------------------------------------------------------------------------------------------------
#Stacking Correlation Analysis and Principal Components Analysis(PCA) to select the most diversified assets
#-------------------------------------------------------------------------------------------------------------
def get_most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr(log_returns_df,most_diversify_portfolio_assets_list_stacking_PCA_and_corr):
    return log_returns_df[most_diversify_portfolio_assets_list_stacking_PCA_and_corr]

def get_stacking_PCA_and_corr_method_matrix_to_diversify_portfolio(most_diversify_portfolio_assets_df_PCA_corr_method):
    return generate_correlation_matrix(most_diversify_portfolio_assets_df_PCA_corr_method)

#---------------------------------------------------------------------------------------------------------------------------------
#Stacking Hierarchical Clustering, Correlation Analysis and Principal Components Analysis(PCA) to select the most diversified assets
#---------------------------------------------------------------------------------------------------------------------------------
def get_most_diversify_portfolio_asset_hierarchical_clustering_method(returns, g, distance_threshold= 1.5):
    # We will use this method to efficiently reduce the number of assets in our portfolio
    # by selecting a representative asset from each cluster identified in the clustermap
    # Extract the linkage matrix from the clustermap
    linkage_matrix = g.dendrogram_row.linkage
    # Get cluster assignments
    clusters = fcluster(linkage_matrix, t=distance_threshold, criterion='distance')
    # fcluster returns the labels in the original row order of the clustered matrix, so pair them
    # with the columns in that order (not with the dendrogram's reordered index)
    asset_clusters = pd.DataFrame({'Asset': returns.columns, 'Cluster': clusters})
    return asset_clusters
# Function to find the asset closest to the centroid of each cluster (this function was generated by ChatGPT)
def select_most_divesified_portfolio_assets_stacking_hierarchical_clustering_method(log_returns_df, asset_clusters):
    most_divesified_portfolio_assets_list_stacking_hierarchical_clustering_method = []
    for cluster in asset_clusters['Cluster'].unique():
        cluster_assets = asset_clusters[asset_clusters['Cluster'] == cluster]['Asset']
        cluster_returns = log_returns_df[cluster_assets].mean(axis=1) # Compute the centroid
        # Euclidean distance between each asset's return series and the centroid
        distances = log_returns_df[cluster_assets].apply(lambda x: np.linalg.norm(x - cluster_returns), axis=0)
        representative_asset = distances.idxmin()
        most_divesified_portfolio_assets_list_stacking_hierarchical_clustering_method.append(representative_asset)
    return most_divesified_portfolio_assets_list_stacking_hierarchical_clustering_method
def create_clustermap(assets_matrix):
    g = sns.clustermap(assets_matrix, method = 'ward', metric='euclidean', cmap = 'RdBu', annot = True, annot_kws = {'size': 8},
                       row_cluster=True, col_cluster=True)
    plt.close()
    return g

def select_most_divesified_portfolio_assets_df_stacking_hierarchical_clustering_method(log_returns_df,
        most_divesified_portfolio_assets_list_stacking_hierarchical_clustering_method):
    return log_returns_df[most_divesified_portfolio_assets_list_stacking_hierarchical_clustering_method]

def select_most_divesified_portfolio_assets_matrix_stacking_hierarchical_clustering_method(most_divesified_portfolio_assets_df_stacking_hierarchical_clustering_method):
    return generate_correlation_matrix(most_divesified_portfolio_assets_df_stacking_hierarchical_clustering_method)

#------------------------------------------------------------------------------------------------------------------------------------------
def plotting_selected_assets_corr_mat_clustermap(assets_matrix, title, dendrogram = True):
    g = sns.clustermap(assets_matrix, method = 'ward', metric='euclidean', cmap = 'RdBu', annot = True, annot_kws = {'size': 8},
                       row_cluster=dendrogram, col_cluster=dendrogram)
    plt.subplots_adjust(top=0.85)
    plt.setp(g.ax_heatmap.get_xticklabels(), rotation=90)
    plt.setp(g.ax_heatmap.get_yticklabels(), rotation=360)
    g.cax.set_position([1.02, 0.2, 0.03, 0.4]) # [left, bottom, width, height]
    g.cax.set_ylabel('Correlation Coefficient', rotation=270, labelpad=15) # Rotate label
    g.fig.suptitle(title, y=0.9, fontsize=12)
#----------------------------------------------------------------------------------------------------------
# Most diversified assets volatility
#-------------------------------------------------------------------------------------------------------------
def get_selected_assets_volatility_df_from_Stack_Corr_PCA_method(selected_assets_adj_close_price_log_return_df, frequency_date_column = 'day'):
    # First letter of the frequency name, e.g. 'D' for day or 'Q' for quarter
    frequency = frequency_date_column[0].upper()
    # Annualized volatility of the selected assets over a 252-trading-day rolling window
    selected_assets_volatility_df = selected_assets_adj_close_price_log_return_df.rolling(center=False,window= 252).std() * np.sqrt(252)
    selected_assets_volatility_df = selected_assets_volatility_df.add_suffix(' Volatility')
    selected_assets_volatility_df = selected_assets_volatility_df.dropna(axis=0)
    if frequency == 'D':
        selected_assets_volatilities = selected_assets_volatility_df
    else:
        # Average the daily rolling volatilities within each period of the requested frequency
        selected_assets_volatility_df[frequency_date_column] = pd.to_datetime(selected_assets_volatility_df.index, format = '%m/%Y')
        selected_assets_volatility_df[frequency_date_column] = selected_assets_volatility_df[frequency_date_column].dt.to_period(frequency)
        selected_assets_volatility_df.set_index(frequency_date_column, inplace=True)
        selected_assets_volatilities = selected_assets_volatility_df.groupby(frequency_date_column).mean()
    selected_assets_volatilities = round(selected_assets_volatilities,1)
    selected_assets_volatilities = selected_assets_volatilities.dropna(axis=0)
    return selected_assets_volatilities
#-------------------------portfolio arithmetics-summary------------------------
#merge the content data frame and the weight data frame
def most_diversified_portfolio_arithmetics(most_divesified_portfolio_arihtmetics_df, index_content_df ):
    most_divesified_portfolio_arihtmetics_df_reset = most_divesified_portfolio_arihtmetics_df.reset_index()
    most_divesified_portfolio_arihtmetics_df_reset.rename(columns={'index': 'Ticker'}, inplace=True)
    most_divesified_portfolio_arihtmetics_df_details = pd.merge(most_divesified_portfolio_arihtmetics_df_reset, index_content_df, how="inner", on=["Ticker"])
    return most_divesified_portfolio_arihtmetics_df_details

#----------------------------------------------- Plot the scatter matrix with regression lines------------------------------------------
def plot_scatter_matrix(df):
    #sns.pairplot(df, kind='reg', height=3, aspect=3)
    #height=3, aspect=1.2
    g = sns.pairplot(df, kind='reg', height=3, aspect=1.2)
    g.fig.set_size_inches(12, 8)
    #plt.suptitle("Pairplot with Regression Lines", y=1.02, fontsize=12)
    plt.suptitle("Scatter Matrix for Stock Prices with Regression Lines", y=1.02, fontsize=12, fontweight='bold', color='blue')
    # Adjust the axis font size
    plt.tick_params(axis='both', which='major', labelsize=50)
    #plt.show()
#------------------------------------------------------plotting portfolio structure----------
def plot_portfolio_structure_pie_chart(most_divesified_portfolio_arihtmetics_df_details):
    most_divesified_portfolio_arihtmetics_df_details = most_divesified_portfolio_arihtmetics_df_details.sort_values(by='modifiy shape(Er)/𝝈',ascending=True)
    industry_labels = most_divesified_portfolio_arihtmetics_df_details['Industry'].values
    sector_labels = most_divesified_portfolio_arihtmetics_df_details['Sector'].values
    modifiy_sharpe_values = most_divesified_portfolio_arihtmetics_df_details['modifiy shape(Er)/𝝈'].values
    # Create subplots: use 'domain' type for Pie subplot
    fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
    fig.add_trace(go.Pie(labels=industry_labels, values=modifiy_sharpe_values, name="Industry",
                         legendgroup="Industry", # this can be any string, not just "group"
                         legendgrouptitle_text="Industry"), 1, 1)
    fig.add_trace(go.Pie(labels=sector_labels, values=modifiy_sharpe_values, name="Sector",
                         legendgroup="Sector", # this can be any string, not just "group"
                         legendgrouptitle_text="Sector"), 1, 2)
    # Use `hole` to create a donut-like pie chart
    fig.update_traces(hole=.5, hoverinfo="label+percent+name")
    fig.update_layout(
        title_text=" Asset - Risk-Adjusted Return (modified Sharpe (Er)/risk by Industry & Sector)",
        # Add annotations in the center of the donut pies.
        annotations=[dict(text='Industry', x=0.14, y=0.5, font_size=20, showarrow=False),
                     dict(text='Sector', x=0.84, y=0.5, font_size=20, showarrow=False)],
        height=500,
        width=800,
        autosize=True,
        margin=dict(t=0, b=0, l=50, r=0),
        legend_tracegroupgap = 0,
        legend=dict(
            orientation="v",
            yanchor="bottom",
            y=0,
            xanchor="right",
            x=1.5),
        title=dict(
            y=0.9,
            x=0.1,
            xanchor= 'left',
            yanchor= 'top'))
    fig.show()
def plot_portfolio_structure( p_most_divesified_portfolio_arihtmetics_df_details):
    fig, ax = plt.subplots(figsize=(12, 6))
    l_most_divesified_portfolio_arihtmetics_df_details = p_most_divesified_portfolio_arihtmetics_df_details.sort_values(by='modifiy shape(Er)/𝝈',ascending=True)
    modifiy_sharpe_values = l_most_divesified_portfolio_arihtmetics_df_details['modifiy shape(Er)/𝝈']
    # Build the bar labels with scalar separators so they stay aligned with the sorted rows
    Tickers = l_most_divesified_portfolio_arihtmetics_df_details['Sector'] + ': ' + \
              l_most_divesified_portfolio_arihtmetics_df_details['Industry'] + ': ' + \
              l_most_divesified_portfolio_arihtmetics_df_details['Company'] + ': ' + \
              l_most_divesified_portfolio_arihtmetics_df_details['Ticker']
    bar_container= ax.barh(Tickers, modifiy_sharpe_values*100)
    ax.axes.get_xaxis().set_visible(False)
    # setting label of y-axis
    ax.set_ylabel("Asset Tickers")
    # setting label of x-axis
    ax.set_xlabel("Asset modified Sharpe (Er)/𝝈")
    ax.set_title(" Most Diversified Portfolio Structure: Asset Risk-Adjusted Return (modified Sharpe (Er)/risk)",fontsize=22, horizontalalignment='right',fontweight='roman')
    ax.bar_label(bar_container, fmt='{:,.1f}%')
    plt.show()
    #Asset return pie chart
    plot_portfolio_structure_pie_chart( p_most_divesified_portfolio_arihtmetics_df_details)
# Data settings
cumulative_variance_treshold = 1.0
threshold_for_highest_loadings = 0.5
correlation_coefficient_treshold = 0.3
distance_threshold= 0.5 # distance cut-off that determines the number of clusters in hierarchical clustering
#-------------------------------------PCA_method----------------------
loadings_matrix_df, explained_variance = setting_PCA_for_assets_selection(log_returns)
num_components = get_num_components(explained_variance,cumulative_variance_treshold)
top_components_df = select_top_components_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
top_indicators_df = select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
most_important_assets_list_PCA_treshold_method = selecting_important_item_PCA_treshold_method(top_indicators_df, threshold_for_highest_loadings)
most_important_assets_log_returns_df_PCA_method = get_most_important_assets_log_returns_df_PCA_method(log_returns, most_important_assets_list_PCA_treshold_method)
most_important_assets_corr_matrix_PCA_method = get_most_important_assets_corr_matrix_PCA_method(most_important_assets_log_returns_df_PCA_method)
#----------------------------------------------------Correlation method-------------------------
selected_assets_list_correlation_method = selecting_important_item_corr_treshold_method(most_important_assets_corr_matrix_PCA_method,
correlation_coefficient_treshold)
selected_assets_log_return_df_correlation_method = log_returns[selected_assets_list_correlation_method]
selected_assets_log_return_Corr_matrix_correlation_method = generate_correlation_matrix(selected_assets_log_return_df_correlation_method)
#-------------------------stack_PCA_corr_method---------------------------------------------------------------
most_diversify_portfolio_assets_list_stack_PCA_corr_method = selecting_important_item_corr_treshold_method(selected_assets_log_return_Corr_matrix_correlation_method,
correlation_coefficient_treshold)
most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr = get_most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr(log_returns,
most_diversify_portfolio_assets_list_stack_PCA_corr_method)
most_diversify_portfolio_assets_matrix_stacking_PCA_and_corr_method = \
get_stacking_PCA_and_corr_method_matrix_to_diversify_portfolio(most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr )
#------------------------------stacking_hierarchical_clustering_method------------------------------------
#get clustermap
clustermap= create_clustermap(most_diversify_portfolio_assets_matrix_stacking_PCA_and_corr_method);
#get asset clusters
asset_clusters = get_most_diversify_portfolio_asset_hierarchical_clustering_method(most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr,
clustermap, distance_threshold)
# Get the representative assets
most_diversify_portfolio_assets_list = \
select_most_divesified_portfolio_assets_stacking_hierarchical_clustering_method(most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr,
asset_clusters)
most_diversify_portfolio_assets_log_returns_df = \
select_most_divesified_portfolio_assets_df_stacking_hierarchical_clustering_method(log_returns, most_diversify_portfolio_assets_list)
most_diversify_portfolio_assets_corr_matrix = \
select_most_divesified_portfolio_assets_matrix_stacking_hierarchical_clustering_method(most_diversify_portfolio_assets_log_returns_df)
selected_assets_volatility_df_stacking_corr_PCA_method = \
get_selected_assets_volatility_df_from_Stack_Corr_PCA_method(most_diversify_portfolio_assets_log_returns_df, frequency_date_column = 'day')
most_diversify_portfolio_assets_initial_prices = stocks_initial_prices[most_diversify_portfolio_assets_list]
most_divesified_portfolio_arihtmetics_df = portfolio_arihtmetics(most_diversify_portfolio_assets_log_returns_df,
most_diversify_portfolio_assets_initial_prices).transpose()
most_divesified_portfolio_arihtmetics_df_details = most_diversified_portfolio_arithmetics(most_divesified_portfolio_arihtmetics_df, index_content_df)
#-------Data printing and recording --------------------------------------------------
warnings.filterwarnings("ignore")
print('\nInitial assets log returns\n')
display(log_returns)
plot_explained_variance_for_assets_selection(loadings_matrix_df, explained_variance)
print_explained_variance(loadings_matrix_df, explained_variance,cumulative_variance_treshold, num_components, threshold_for_highest_loadings)
print('\nMost Important Assets Log returns using PCA\n')
display(most_important_assets_log_returns_df_PCA_method)
plotting_selected_assets_corr_mat_clustermap(most_important_assets_corr_matrix_PCA_method, 'Most Important Assets Correlation Matrix PCA Method')
print('\nMost Important Assets Log returns using Correlation method\n')
display(selected_assets_log_return_df_correlation_method)
plotting_selected_assets_corr_mat_clustermap(selected_assets_log_return_Corr_matrix_correlation_method,
'Most Diversified Assets Correlation Matrix - Correlation Method')
print('\nMost Diversified Assets Log returns using Stack Correlation Matrix/PCA Method\n')
display(most_diversify_portfolio_asset_log_return_df_stacking_PCA_and_corr)
plotting_selected_assets_corr_mat_clustermap(most_diversify_portfolio_assets_matrix_stacking_PCA_and_corr_method,
'Most Diversified Assets Correlation Matrix - stacking Correlation Analysis/PCA Method')
print('\nMost Diversified Assets Log returns using Stacking Hierarchical Clustering, Correlation Analysis & PCA Method\n')
display(most_diversify_portfolio_assets_log_returns_df)
plotting_selected_assets_corr_mat_clustermap(most_diversify_portfolio_assets_corr_matrix,
'Most Diversified Assets Correlation Matrix - Stacking Hierarchical_clustering, Correlation Analysis & PCA Method')
print('\nDiversified Portfolio Assets Volatility \n')
display(selected_assets_volatility_df_stacking_corr_PCA_method)
plot_scatter_matrix(most_diversify_portfolio_assets_log_returns_df)
print('\nMost Diversified Portfolio arithmetics details\n')
display(most_divesified_portfolio_arihtmetics_df_details)
plot_portfolio_structure( most_divesified_portfolio_arihtmetics_df_details)
Initial assets log returns
| Date | AEM | AGI | ATS | BLX | BMO | BN | BNS | BTE | BTO | BYD | ... | TD | TFII | TPZ | TRI | TRP | TVE | WCN | WFG | WPM | X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.005959 | 0.028438 | 0.000000 | -0.005571 | 0.001910 | 0.010767 | 0.000534 | 0.018018 | -0.006152 | -0.014676 | ... | 0.005576 | 0.000000 | 0.000000 | 0.002668 | 0.004330 | 0.000000 | 0.003438 | 0.000000 | 0.009932 | -0.118386 |
| 2019-09-20 | 0.016807 | 0.015456 | -0.006543 | -0.003052 | 0.001362 | -0.004812 | 0.000889 | 0.046520 | -0.001235 | -0.019909 | ... | 0.002256 | 0.000000 | 0.004710 | -0.011914 | 0.015202 | 0.005486 | 0.003315 | 0.000000 | 0.002924 | -0.022863 |
| 2019-09-23 | 0.022427 | 0.019743 | 0.000000 | -0.002550 | -0.005186 | -0.014012 | 0.003372 | -0.005698 | -0.000927 | -0.008154 | ... | 0.000000 | 0.002291 | -0.001106 | 0.003440 | 0.005978 | -0.005879 | 0.003964 | -0.024681 | 0.024871 | 0.021053 |
| 2019-09-24 | 0.006364 | 0.008982 | -0.002922 | 0.002041 | -0.007554 | -0.010592 | 0.006709 | -0.064920 | -0.013699 | -0.022473 | ... | -0.003996 | 0.000000 | -0.005550 | 0.005657 | -0.002503 | -0.000786 | 0.003401 | 0.004599 | 0.012383 | -0.030347 |
| 2019-09-25 | -0.024847 | -0.053571 | -0.006606 | 0.012662 | 0.009195 | 0.008897 | 0.003864 | 0.018127 | 0.014626 | -0.008811 | ... | -0.002265 | 0.000000 | -0.000556 | 0.005183 | 0.000000 | -0.003942 | -0.003292 | 0.000000 | -0.036159 | 0.066812 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.011251 | 0.003309 | 0.032904 | 0.002256 | 0.008346 | 0.022295 | 0.015522 | -0.006536 | 0.004310 | 0.010163 | ... | 0.018057 | 0.002572 | 0.004999 | 0.014097 | 0.008106 | 0.003559 | 0.012948 | 0.004000 | 0.009954 | 0.048379 |
| 2024-09-10 | 0.014681 | 0.029302 | 0.003042 | -0.009706 | -0.001567 | 0.001483 | 0.003115 | -0.029952 | -0.006163 | -0.016714 | ... | -0.006529 | -0.013214 | 0.000000 | 0.016763 | -0.027570 | -0.001778 | -0.001731 | -0.002055 | 0.014411 | -0.049979 |
| 2024-09-11 | 0.002405 | 0.011167 | 0.006433 | -0.025685 | 0.017460 | 0.017833 | 0.006007 | 0.013423 | -0.001856 | -0.007953 | ... | 0.010587 | 0.030752 | -0.006113 | 0.003958 | 0.001964 | 0.001778 | 0.003837 | -0.008379 | -0.002359 | 0.067198 |
| 2024-09-12 | 0.034298 | 0.060360 | -0.011763 | 0.007311 | 0.009205 | 0.019594 | -0.000580 | 0.019803 | 0.007713 | 0.019516 | ... | 0.002751 | 0.001961 | 0.008879 | 0.009425 | 0.004567 | 0.000000 | 0.003231 | 0.017141 | 0.034818 | 0.039635 |
| 2024-09-13 | 0.015876 | 0.030923 | -0.018879 | 0.017723 | 0.005038 | 0.008339 | 0.005975 | -0.006557 | 0.005940 | 0.023286 | ... | 0.004996 | 0.000350 | 0.005510 | -0.006119 | 0.009931 | -0.000444 | -0.001668 | 0.025173 | 0.019205 | 0.037570 |
1255 rows × 79 columns
explained_variance_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | ... | PC70 | PC71 | PC72 | PC73 | PC74 | PC75 | PC76 | PC77 | PC78 | PC79 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.403539 | 0.100359 | 0.040445 | 0.03251 | 0.027595 | 0.021537 | 0.01613 | 0.015612 | 0.01322 | 0.01211 | ... | 0.001581 | 0.001549 | 0.001374 | 0.001331 | 0.001322 | 0.001241 | 0.000908 | 0.000862 | 0.000556 | 0.000306 |
1 rows × 79 columns
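To read the table above in cumulative terms, the sketch below sums the first ten ratios copied from it. It uses NumPy only; the variable names are illustrative, and the remaining 69 components are left out because the table elides some of them.

```python
import numpy as np

# Explained-variance ratios of PC1..PC10, copied from the table above.
explained_variance_head = np.array([
    0.403539, 0.100359, 0.040445, 0.03251, 0.027595,
    0.021537, 0.01613, 0.015612, 0.01322, 0.01211,
])
cumulative_variance_head = np.cumsum(explained_variance_head)

for i, value in enumerate(cumulative_variance_head, start=1):
    print(f"PC1..PC{i}: {value:.4f}")

# Smallest number of components whose cumulative share reaches 50%.
n_for_half = int(np.searchsorted(cumulative_variance_head, 0.5)) + 1
print("components needed for 50%:", n_for_half)
```

The first two components already explain about half of the variance, and the first ten about 68%, so a cumulative threshold of 1.0 keeps many components that each contribute very little.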
loadings_matrix_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | ... | PC70 | PC71 | PC72 | PC73 | PC74 | PC75 | PC76 | PC77 | PC78 | PC79 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AEM | -0.065669 | 0.277218 | -0.010885 | -0.037676 | -0.045264 | 0.033069 | -0.078202 | 0.056256 | -0.058732 | -0.033783 | ... | -0.151400 | 0.035509 | -0.126888 | -0.056302 | 0.066494 | 0.003332 | -0.003731 | -0.001027 | -0.009852 | 0.017048 |
| AGI | -0.058055 | 0.293144 | -0.008136 | -0.050313 | -0.048107 | 0.083242 | -0.011359 | -0.019832 | -0.021472 | 0.009000 | ... | 0.200848 | -0.142541 | 0.064375 | -0.078013 | -0.121092 | 0.010274 | -0.062080 | -0.021057 | -0.011170 | -0.018934 |
| ATS | -0.073845 | -0.003196 | 0.015798 | 0.123400 | 0.075143 | -0.140624 | 0.226966 | -0.039264 | -0.099206 | -0.295643 | ... | -0.008172 | 0.016202 | -0.001900 | 0.006657 | 0.003255 | -0.004879 | 0.021869 | -0.019858 | -0.006765 | 0.001025 |
| BLX | -0.105663 | -0.057798 | -0.041605 | 0.040873 | -0.132797 | 0.298437 | -0.075353 | -0.015556 | -0.119774 | 0.093928 | ... | 0.020047 | -0.048509 | 0.006170 | -0.041851 | -0.002258 | 0.009578 | -0.002888 | -0.048743 | 0.013112 | -0.015189 |
| BMO | -0.152284 | -0.072828 | -0.059178 | -0.050535 | 0.045357 | -0.063581 | 0.165802 | 0.092483 | -0.009710 | 0.068572 | ... | -0.065963 | 0.188700 | 0.066368 | -0.695366 | -0.294240 | -0.254406 | -0.047065 | -0.062295 | 0.034963 | -0.000135 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| TVE | -0.034162 | 0.019887 | 0.083680 | 0.052872 | -0.134708 | 0.027806 | -0.092218 | -0.048004 | 0.667672 | -0.072657 | ... | 0.015626 | -0.004245 | 0.017087 | -0.012480 | -0.006102 | 0.001055 | -0.010073 | -0.019879 | 0.007495 | 0.015122 |
| WCN | -0.100774 | -0.013572 | 0.166350 | -0.181940 | 0.083366 | -0.054607 | -0.242837 | 0.024668 | 0.023151 | 0.059248 | ... | -0.054925 | -0.039940 | 0.050932 | 0.003638 | -0.020457 | 0.050221 | -0.007624 | 0.002607 | -0.004426 | -0.014030 |
| WFG | -0.111501 | -0.000092 | -0.014839 | 0.011644 | 0.001620 | -0.054287 | 0.179568 | 0.018910 | 0.121690 | -0.226637 | ... | 0.018991 | -0.003166 | 0.008847 | -0.037237 | -0.000709 | -0.013008 | -0.008995 | -0.022354 | 0.004389 | 0.012087 |
| WPM | -0.069893 | 0.285385 | 0.016996 | -0.052283 | 0.026792 | -0.037374 | 0.019060 | -0.001279 | -0.071231 | -0.023222 | ... | 0.256976 | -0.020504 | -0.040127 | -0.189320 | -0.096679 | 0.058365 | -0.016540 | -0.006234 | -0.304184 | -0.020128 |
| X | -0.091522 | 0.001560 | -0.102661 | 0.066105 | 0.110150 | 0.183061 | 0.064658 | -0.219873 | 0.155924 | -0.025309 | ... | -0.011965 | -0.010316 | 0.013302 | 0.029585 | -0.013110 | 0.005908 | 0.001931 | -0.030683 | 0.000352 | 0.007323 |
79 rows × 79 columns
top_components_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | ... | PC70 | PC71 | PC72 | PC73 | PC74 | PC75 | PC76 | PC77 | PC78 | PC79 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AEM | -0.065669 | 0.277218 | -0.010885 | -0.037676 | -0.045264 | 0.033069 | -0.078202 | 0.056256 | -0.058732 | -0.033783 | ... | -0.151400 | 0.035509 | -0.126888 | -0.056302 | 0.066494 | 0.003332 | -0.003731 | -0.001027 | -0.009852 | 0.017048 |
| AGI | -0.058055 | 0.293144 | -0.008136 | -0.050313 | -0.048107 | 0.083242 | -0.011359 | -0.019832 | -0.021472 | 0.009000 | ... | 0.200848 | -0.142541 | 0.064375 | -0.078013 | -0.121092 | 0.010274 | -0.062080 | -0.021057 | -0.011170 | -0.018934 |
| ATS | -0.073845 | -0.003196 | 0.015798 | 0.123400 | 0.075143 | -0.140624 | 0.226966 | -0.039264 | -0.099206 | -0.295643 | ... | -0.008172 | 0.016202 | -0.001900 | 0.006657 | 0.003255 | -0.004879 | 0.021869 | -0.019858 | -0.006765 | 0.001025 |
| BLX | -0.105663 | -0.057798 | -0.041605 | 0.040873 | -0.132797 | 0.298437 | -0.075353 | -0.015556 | -0.119774 | 0.093928 | ... | 0.020047 | -0.048509 | 0.006170 | -0.041851 | -0.002258 | 0.009578 | -0.002888 | -0.048743 | 0.013112 | -0.015189 |
| BMO | -0.152284 | -0.072828 | -0.059178 | -0.050535 | 0.045357 | -0.063581 | 0.165802 | 0.092483 | -0.009710 | 0.068572 | ... | -0.065963 | 0.188700 | 0.066368 | -0.695366 | -0.294240 | -0.254406 | -0.047065 | -0.062295 | 0.034963 | -0.000135 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| TVE | -0.034162 | 0.019887 | 0.083680 | 0.052872 | -0.134708 | 0.027806 | -0.092218 | -0.048004 | 0.667672 | -0.072657 | ... | 0.015626 | -0.004245 | 0.017087 | -0.012480 | -0.006102 | 0.001055 | -0.010073 | -0.019879 | 0.007495 | 0.015122 |
| WCN | -0.100774 | -0.013572 | 0.166350 | -0.181940 | 0.083366 | -0.054607 | -0.242837 | 0.024668 | 0.023151 | 0.059248 | ... | -0.054925 | -0.039940 | 0.050932 | 0.003638 | -0.020457 | 0.050221 | -0.007624 | 0.002607 | -0.004426 | -0.014030 |
| WFG | -0.111501 | -0.000092 | -0.014839 | 0.011644 | 0.001620 | -0.054287 | 0.179568 | 0.018910 | 0.121690 | -0.226637 | ... | 0.018991 | -0.003166 | 0.008847 | -0.037237 | -0.000709 | -0.013008 | -0.008995 | -0.022354 | 0.004389 | 0.012087 |
| WPM | -0.069893 | 0.285385 | 0.016996 | -0.052283 | 0.026792 | -0.037374 | 0.019060 | -0.001279 | -0.071231 | -0.023222 | ... | 0.256976 | -0.020504 | -0.040127 | -0.189320 | -0.096679 | 0.058365 | -0.016540 | -0.006234 | -0.304184 | -0.020128 |
| X | -0.091522 | 0.001560 | -0.102661 | 0.066105 | 0.110150 | 0.183061 | 0.064658 | -0.219873 | 0.155924 | -0.025309 | ... | -0.011965 | -0.010316 | 0.013302 | 0.029585 | -0.013110 | 0.005908 | 0.001931 | -0.030683 | 0.000352 | 0.007323 |
79 rows × 79 columns
Most important assets with top components
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | ... | PC70 | PC71 | PC72 | PC73 | PC74 | PC75 | PC76 | PC77 | PC78 | PC79 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AGI | -0.058055 | 0.293144 | -0.008136 | -0.050313 | -0.048107 | 0.083242 | -0.011359 | -0.019832 | -0.021472 | 0.009000 | ... | 0.200848 | -0.142541 | 0.064375 | -0.078013 | -0.121092 | 0.010274 | -0.062080 | -0.021057 | -0.011170 | -0.018934 |
| ATS | -0.073845 | -0.003196 | 0.015798 | 0.123400 | 0.075143 | -0.140624 | 0.226966 | -0.039264 | -0.099206 | -0.295643 | ... | -0.008172 | 0.016202 | -0.001900 | 0.006657 | 0.003255 | -0.004879 | 0.021869 | -0.019858 | -0.006765 | 0.001025 |
| BMO | -0.152284 | -0.072828 | -0.059178 | -0.050535 | 0.045357 | -0.063581 | 0.165802 | 0.092483 | -0.009710 | 0.068572 | ... | -0.065963 | 0.188700 | 0.066368 | -0.695366 | -0.294240 | -0.254406 | -0.047065 | -0.062295 | 0.034963 | -0.000135 |
| BN | -0.147581 | -0.049083 | 0.067809 | -0.010327 | 0.068137 | -0.009248 | 0.077914 | 0.100975 | 0.069761 | 0.030375 | ... | -0.024295 | 0.050163 | -0.147035 | 0.029083 | 0.012623 | -0.131303 | -0.120203 | -0.041728 | 0.020601 | 0.002664 |
| CIX | -0.055727 | 0.012151 | 0.005574 | 0.033036 | -0.082174 | 0.201782 | -0.029220 | -0.161301 | 0.198752 | 0.390850 | ... | -0.007709 | -0.004708 | 0.004186 | 0.011902 | -0.018790 | -0.012573 | 0.003734 | -0.008440 | -0.019901 | -0.001135 |
| CNQ | -0.129044 | -0.052601 | -0.266303 | 0.050060 | 0.038974 | -0.130717 | -0.204224 | -0.042648 | -0.015670 | -0.005193 | ... | 0.210442 | 0.028941 | 0.017754 | 0.194277 | 0.003770 | -0.664877 | -0.024434 | 0.167211 | 0.073147 | -0.007912 |
| DOL | -0.160518 | -0.010954 | 0.042704 | -0.028079 | 0.084447 | -0.018499 | 0.009780 | -0.076932 | -0.108125 | 0.070747 | ... | -0.011061 | 0.050432 | -0.002232 | 0.013674 | 0.011408 | -0.017905 | -0.054481 | -0.036818 | -0.006025 | 0.750246 |
| DOO | -0.157289 | -0.010757 | 0.036949 | -0.038733 | 0.075571 | -0.001145 | 0.031621 | -0.091489 | -0.088259 | 0.055132 | ... | -0.055132 | 0.074028 | -0.024020 | 0.007641 | 0.047442 | -0.035191 | 0.022945 | 0.006520 | -0.044958 | -0.650055 |
| ENB | -0.144415 | -0.040819 | -0.102101 | -0.085299 | -0.051066 | -0.156501 | -0.081331 | 0.078633 | -0.006538 | 0.127160 | ... | -0.025530 | -0.589933 | 0.005431 | -0.129889 | -0.084892 | -0.052072 | -0.082099 | -0.049808 | 0.037564 | -0.014443 |
| IGM | -0.127941 | -0.005348 | 0.247802 | 0.140876 | 0.187187 | -0.037198 | -0.125438 | -0.093726 | -0.045313 | 0.087582 | ... | 0.074561 | 0.000017 | -0.096667 | 0.006884 | -0.005883 | -0.111989 | 0.688758 | -0.389601 | 0.034563 | 0.007848 |
| NGD | -0.062664 | 0.234990 | -0.036112 | -0.013709 | -0.009670 | 0.062311 | 0.012907 | 0.014675 | 0.025940 | 0.013104 | ... | -0.030877 | -0.015453 | 0.014054 | 0.012862 | 0.023706 | -0.027772 | -0.027923 | 0.001248 | -0.021424 | 0.010592 |
| PEY | -0.149404 | -0.056981 | -0.021802 | -0.196603 | 0.035194 | 0.165845 | 0.021249 | -0.125862 | 0.031806 | -0.043429 | ... | -0.052847 | -0.053877 | -0.111617 | -0.135503 | 0.023964 | 0.078462 | 0.429540 | 0.743572 | -0.106402 | 0.063064 |
| RY | -0.152320 | -0.051308 | -0.021665 | -0.109855 | 0.021532 | -0.048933 | 0.157040 | 0.052900 | 0.019120 | 0.112541 | ... | 0.295141 | 0.135316 | 0.403495 | 0.063244 | 0.502920 | -0.047708 | 0.004983 | 0.080169 | -0.040179 | -0.001603 |
| SIL | -0.084934 | 0.290093 | -0.017374 | 0.014017 | -0.016193 | 0.030938 | 0.047216 | -0.032776 | -0.051777 | -0.011255 | ... | 0.020233 | 0.049425 | -0.035915 | 0.007104 | 0.017996 | 0.118807 | 0.029920 | 0.099958 | 0.842134 | -0.022465 |
| TD | -0.148354 | -0.069229 | -0.062435 | -0.100556 | 0.066975 | -0.042147 | 0.179859 | 0.016946 | -0.013172 | 0.087252 | ... | 0.022431 | 0.039052 | -0.080281 | 0.324628 | -0.353065 | 0.077180 | -0.011293 | 0.042773 | -0.001475 | -0.024636 |
| TVE | -0.034162 | 0.019887 | 0.083680 | 0.052872 | -0.134708 | 0.027806 | -0.092218 | -0.048004 | 0.667672 | -0.072657 | ... | 0.015626 | -0.004245 | 0.017087 | -0.012480 | -0.006102 | 0.001055 | -0.010073 | -0.019879 | 0.007495 | 0.015122 |
16 rows × 79 columns
Most Important Assets Log returns using PCA
| Date | AGI | ATS | BMO | BN | CIX | CNQ | DOL | DOO | ENB | IGM | NGD | PEY | RY | SIL | TD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.028438 | 0.000000 | 0.001910 | 0.010767 | -0.005358 | 0.001106 | 0.001718 | 0.000834 | 0.000284 | 0.002345 | 0.000000 | -0.003262 | 0.006974 | 0.006826 | 0.005576 | 0.000000 |
| 2019-09-20 | 0.015456 | -0.006543 | 0.001362 | -0.004812 | 0.000000 | 0.010625 | -0.000859 | -0.002021 | 0.003970 | -0.011009 | -0.050010 | -0.002179 | 0.009018 | 0.017198 | 0.002256 | 0.005486 |
| 2019-09-23 | 0.019743 | 0.000000 | -0.005186 | -0.014012 | -0.001344 | 0.006900 | -0.002149 | -0.003786 | -0.005391 | 0.000182 | 0.105361 | 0.002897 | -0.004067 | 0.029322 | 0.000000 | -0.005879 |
| 2019-09-24 | 0.008982 | -0.002922 | -0.007554 | -0.010592 | -0.046104 | -0.017894 | -0.005535 | -0.005219 | 0.006239 | -0.014626 | 0.030305 | -0.006572 | -0.004208 | 0.012903 | -0.003996 | -0.000786 |
| 2019-09-25 | -0.053571 | -0.006606 | 0.009195 | 0.008897 | 0.023661 | -0.006655 | -0.002179 | -0.002736 | -0.001698 | 0.013147 | -0.085655 | 0.007663 | 0.004578 | -0.043564 | -0.002265 | -0.003942 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.003309 | 0.032904 | 0.008346 | 0.022295 | 0.078256 | 0.007794 | 0.009506 | 0.009678 | 0.009855 | 0.010638 | 0.017022 | 0.005634 | 0.016810 | 0.006941 | 0.018057 | 0.003559 |
| 2024-09-10 | 0.029302 | 0.003042 | -0.001567 | 0.001483 | 0.001090 | -0.038970 | -0.007407 | -0.006583 | -0.013079 | 0.009850 | 0.049393 | 0.000000 | -0.005790 | 0.014389 | -0.006529 | -0.001778 |
| 2024-09-11 | 0.011167 | 0.006433 | 0.017460 | 0.017833 | 0.019425 | 0.005573 | 0.003615 | 0.002253 | -0.000497 | 0.024374 | 0.058496 | -0.008463 | 0.008226 | 0.021202 | 0.010587 | 0.001778 |
| 2024-09-12 | 0.060360 | -0.011763 | 0.009205 | 0.019594 | -0.009665 | 0.007689 | 0.009640 | 0.004162 | 0.006194 | 0.009738 | 0.090478 | 0.003770 | 0.005501 | 0.063724 | 0.002751 | 0.000000 |
| 2024-09-13 | 0.030923 | -0.018879 | 0.005038 | 0.008339 | 0.021001 | -0.008813 | 0.002067 | 0.005259 | 0.005910 | 0.006512 | 0.070146 | 0.015403 | -0.002989 | 0.048606 | 0.004996 | -0.000444 |
1255 rows × 16 columns
Most Important Assets Log returns using Correlation method
| Date | AGI | ATS | BMO | BN | CIX | CNQ | DOL | DOO | ENB | IGM | NGD | PEY | RY | SIL | TD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.028438 | 0.000000 | 0.001910 | 0.010767 | -0.005358 | 0.001106 | 0.001718 | 0.000834 | 0.000284 | 0.002345 | 0.000000 | -0.003262 | 0.006974 | 0.006826 | 0.005576 | 0.000000 |
| 2019-09-20 | 0.015456 | -0.006543 | 0.001362 | -0.004812 | 0.000000 | 0.010625 | -0.000859 | -0.002021 | 0.003970 | -0.011009 | -0.050010 | -0.002179 | 0.009018 | 0.017198 | 0.002256 | 0.005486 |
| 2019-09-23 | 0.019743 | 0.000000 | -0.005186 | -0.014012 | -0.001344 | 0.006900 | -0.002149 | -0.003786 | -0.005391 | 0.000182 | 0.105361 | 0.002897 | -0.004067 | 0.029322 | 0.000000 | -0.005879 |
| 2019-09-24 | 0.008982 | -0.002922 | -0.007554 | -0.010592 | -0.046104 | -0.017894 | -0.005535 | -0.005219 | 0.006239 | -0.014626 | 0.030305 | -0.006572 | -0.004208 | 0.012903 | -0.003996 | -0.000786 |
| 2019-09-25 | -0.053571 | -0.006606 | 0.009195 | 0.008897 | 0.023661 | -0.006655 | -0.002179 | -0.002736 | -0.001698 | 0.013147 | -0.085655 | 0.007663 | 0.004578 | -0.043564 | -0.002265 | -0.003942 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.003309 | 0.032904 | 0.008346 | 0.022295 | 0.078256 | 0.007794 | 0.009506 | 0.009678 | 0.009855 | 0.010638 | 0.017022 | 0.005634 | 0.016810 | 0.006941 | 0.018057 | 0.003559 |
| 2024-09-10 | 0.029302 | 0.003042 | -0.001567 | 0.001483 | 0.001090 | -0.038970 | -0.007407 | -0.006583 | -0.013079 | 0.009850 | 0.049393 | 0.000000 | -0.005790 | 0.014389 | -0.006529 | -0.001778 |
| 2024-09-11 | 0.011167 | 0.006433 | 0.017460 | 0.017833 | 0.019425 | 0.005573 | 0.003615 | 0.002253 | -0.000497 | 0.024374 | 0.058496 | -0.008463 | 0.008226 | 0.021202 | 0.010587 | 0.001778 |
| 2024-09-12 | 0.060360 | -0.011763 | 0.009205 | 0.019594 | -0.009665 | 0.007689 | 0.009640 | 0.004162 | 0.006194 | 0.009738 | 0.090478 | 0.003770 | 0.005501 | 0.063724 | 0.002751 | 0.000000 |
| 2024-09-13 | 0.030923 | -0.018879 | 0.005038 | 0.008339 | 0.021001 | -0.008813 | 0.002067 | 0.005259 | 0.005910 | 0.006512 | 0.070146 | 0.015403 | -0.002989 | 0.048606 | 0.004996 | -0.000444 |
1255 rows × 16 columns
Most Diversified Assets Log returns using Stack Correlation Matrix/PCA Method
| Date | AGI | ATS | BMO | BN | CIX | CNQ | DOL | DOO | ENB | IGM | NGD | PEY | RY | SIL | TD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.028438 | 0.000000 | 0.001910 | 0.010767 | -0.005358 | 0.001106 | 0.001718 | 0.000834 | 0.000284 | 0.002345 | 0.000000 | -0.003262 | 0.006974 | 0.006826 | 0.005576 | 0.000000 |
| 2019-09-20 | 0.015456 | -0.006543 | 0.001362 | -0.004812 | 0.000000 | 0.010625 | -0.000859 | -0.002021 | 0.003970 | -0.011009 | -0.050010 | -0.002179 | 0.009018 | 0.017198 | 0.002256 | 0.005486 |
| 2019-09-23 | 0.019743 | 0.000000 | -0.005186 | -0.014012 | -0.001344 | 0.006900 | -0.002149 | -0.003786 | -0.005391 | 0.000182 | 0.105361 | 0.002897 | -0.004067 | 0.029322 | 0.000000 | -0.005879 |
| 2019-09-24 | 0.008982 | -0.002922 | -0.007554 | -0.010592 | -0.046104 | -0.017894 | -0.005535 | -0.005219 | 0.006239 | -0.014626 | 0.030305 | -0.006572 | -0.004208 | 0.012903 | -0.003996 | -0.000786 |
| 2019-09-25 | -0.053571 | -0.006606 | 0.009195 | 0.008897 | 0.023661 | -0.006655 | -0.002179 | -0.002736 | -0.001698 | 0.013147 | -0.085655 | 0.007663 | 0.004578 | -0.043564 | -0.002265 | -0.003942 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.003309 | 0.032904 | 0.008346 | 0.022295 | 0.078256 | 0.007794 | 0.009506 | 0.009678 | 0.009855 | 0.010638 | 0.017022 | 0.005634 | 0.016810 | 0.006941 | 0.018057 | 0.003559 |
| 2024-09-10 | 0.029302 | 0.003042 | -0.001567 | 0.001483 | 0.001090 | -0.038970 | -0.007407 | -0.006583 | -0.013079 | 0.009850 | 0.049393 | 0.000000 | -0.005790 | 0.014389 | -0.006529 | -0.001778 |
| 2024-09-11 | 0.011167 | 0.006433 | 0.017460 | 0.017833 | 0.019425 | 0.005573 | 0.003615 | 0.002253 | -0.000497 | 0.024374 | 0.058496 | -0.008463 | 0.008226 | 0.021202 | 0.010587 | 0.001778 |
| 2024-09-12 | 0.060360 | -0.011763 | 0.009205 | 0.019594 | -0.009665 | 0.007689 | 0.009640 | 0.004162 | 0.006194 | 0.009738 | 0.090478 | 0.003770 | 0.005501 | 0.063724 | 0.002751 | 0.000000 |
| 2024-09-13 | 0.030923 | -0.018879 | 0.005038 | 0.008339 | 0.021001 | -0.008813 | 0.002067 | 0.005259 | 0.005910 | 0.006512 | 0.070146 | 0.015403 | -0.002989 | 0.048606 | 0.004996 | -0.000444 |
1255 rows × 16 columns
Most Diversified Assets Log returns using Stacking Hierarchical Clustering, Correlation Analysis & PCA Method
| Date | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-09-19 | 0.002345 | 0.001106 | 0.001718 | 0.000834 | 0.010767 | -0.003262 | 0.000284 | 0.001910 | 0.005576 | 0.000000 | 0.000000 |
| 2019-09-20 | -0.011009 | 0.010625 | -0.000859 | -0.002021 | -0.004812 | -0.002179 | 0.003970 | 0.001362 | 0.002256 | -0.050010 | 0.005486 |
| 2019-09-23 | 0.000182 | 0.006900 | -0.002149 | -0.003786 | -0.014012 | 0.002897 | -0.005391 | -0.005186 | 0.000000 | 0.105361 | -0.005879 |
| 2019-09-24 | -0.014626 | -0.017894 | -0.005535 | -0.005219 | -0.010592 | -0.006572 | 0.006239 | -0.007554 | -0.003996 | 0.030305 | -0.000786 |
| 2019-09-25 | 0.013147 | -0.006655 | -0.002179 | -0.002736 | 0.008897 | 0.007663 | -0.001698 | 0.009195 | -0.002265 | -0.085655 | -0.003942 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.010638 | 0.007794 | 0.009506 | 0.009678 | 0.022295 | 0.005634 | 0.009855 | 0.008346 | 0.018057 | 0.017022 | 0.003559 |
| 2024-09-10 | 0.009850 | -0.038970 | -0.007407 | -0.006583 | 0.001483 | 0.000000 | -0.013079 | -0.001567 | -0.006529 | 0.049393 | -0.001778 |
| 2024-09-11 | 0.024374 | 0.005573 | 0.003615 | 0.002253 | 0.017833 | -0.008463 | -0.000497 | 0.017460 | 0.010587 | 0.058496 | 0.001778 |
| 2024-09-12 | 0.009738 | 0.007689 | 0.009640 | 0.004162 | 0.019594 | 0.003770 | 0.006194 | 0.009205 | 0.002751 | 0.090478 | 0.000000 |
| 2024-09-13 | 0.006512 | -0.008813 | 0.002067 | 0.005259 | 0.008339 | 0.015403 | 0.005910 | 0.005038 | 0.004996 | 0.070146 | -0.000444 |
1255 rows × 11 columns
Diversified Portfolio Assets Volatility
| Date | IGM Volatility | CNQ Volatility | DOL Volatility | DOO Volatility | BN Volatility | PEY Volatility | ENB Volatility | BMO Volatility | TD Volatility | NGD Volatility | TVE Volatility |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020-09-17 | 0.360603 | 0.810664 | 0.308874 | 0.298213 | 0.511776 | 0.388163 | 0.472484 | 0.495009 | 0.436419 | 0.851092 | 0.106949 |
| 2020-09-18 | 0.360840 | 0.810868 | 0.308948 | 0.298232 | 0.511690 | 0.388285 | 0.472608 | 0.495128 | 0.436527 | 0.851180 | 0.107394 |
| 2020-09-21 | 0.360633 | 0.812542 | 0.309930 | 0.299506 | 0.512432 | 0.389360 | 0.472802 | 0.495544 | 0.437326 | 0.851868 | 0.107525 |
| 2020-09-22 | 0.361150 | 0.812808 | 0.309925 | 0.299597 | 0.512336 | 0.389364 | 0.473020 | 0.495678 | 0.437342 | 0.846089 | 0.107430 |
| 2020-09-23 | 0.362196 | 0.813346 | 0.310158 | 0.299681 | 0.512973 | 0.390019 | 0.474124 | 0.495760 | 0.437471 | 0.854508 | 0.107572 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-09 | 0.213149 | 0.280058 | 0.122660 | 0.121811 | 0.292684 | 0.163629 | 0.161075 | 0.226355 | 0.186543 | 0.573182 | 0.084597 |
| 2024-09-10 | 0.213319 | 0.282738 | 0.122913 | 0.122023 | 0.292288 | 0.163627 | 0.161460 | 0.226361 | 0.186667 | 0.575011 | 0.084177 |
| 2024-09-11 | 0.214513 | 0.281679 | 0.122394 | 0.121479 | 0.292176 | 0.163795 | 0.161413 | 0.226418 | 0.185932 | 0.577119 | 0.084182 |
| 2024-09-12 | 0.213909 | 0.280979 | 0.122713 | 0.121532 | 0.292604 | 0.163765 | 0.161069 | 0.226494 | 0.185897 | 0.583353 | 0.084043 |
| 2024-09-13 | 0.213971 | 0.281040 | 0.122721 | 0.121607 | 0.292210 | 0.164200 | 0.161088 | 0.226239 | 0.185794 | 0.586887 | 0.084009 |
1004 rows × 11 columns
Most Diversified Portfolio arithmetics details
| | Ticker | mu (expected return) | variance | sigma (volatility) | modified Sharpe (E[r]/σ) | initial price | Company | Sector | Industry |
|---|---|---|---|---|---|---|---|---|---|
| 0 | IGM | 0.000747 | 0.000307 | 0.017516 | 0.042633 | 36.212482 | IGM Financial Inc. | Financial Services | Asset Management |
| 1 | CNQ | 0.000925 | 0.000943 | 0.030710 | 0.030125 | 10.011539 | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production |
| 2 | DOL | 0.000257 | 0.000147 | 0.012131 | 0.021183 | 38.585945 | Dollarama Inc. | Consumer Defensive | Retail Defensive |
| 3 | DOO | 0.000143 | 0.000144 | 0.012008 | 0.011938 | 36.010262 | BRP Inc. | Consumer Cyclical | Vehicles & Parts |
| 4 | BN | 0.000468 | 0.000511 | 0.022599 | 0.020709 | 27.440422 | Brookfield Corporation | Financial Services | Asset Management |
| 5 | PEY | 0.000306 | 0.000216 | 0.014697 | 0.020790 | 14.713406 | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production |
| 6 | ENB | 0.000373 | 0.000308 | 0.017539 | 0.021273 | 25.500664 | Enbridge Inc. | Energy | Oil & Gas Storage/Transport |
| 7 | BMO | 0.000303 | 0.000350 | 0.018704 | 0.016185 | 58.515537 | Bank of Montreal | Financial Services | Banks |
| 8 | TD | 0.000243 | 0.000282 | 0.016793 | 0.014462 | 45.858364 | Toronto-Dominion Bank | Financial Services | Banks |
| 9 | NGD | 0.000737 | 0.001866 | 0.043194 | 0.017052 | 1.230000 | New Gold Inc. | Basic Materials | Metals & Mining |
| 10 | TVE | 0.000003 | 0.000047 | 0.006884 | 0.000416 | 22.429239 | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production |
Before going further with our analysis, it is crucial to understand the shape of the data distribution, here the distribution of stock prices and asset returns over time, in order to choose an appropriate model. The daily adjusted close price charts above show that stock prices and their returns follow an independent random process over time, with stock prices that are always positive and can increase indefinitely. The distribution exhibits positive skewness. The future price movement does not depend on its history; it is determined by its current state and by inherent randomness driven by factors such as economic indicators, company performance, investor sentiment, geopolitical events, and unforeseen news. Stock prices are therefore considered stochastic. In general we do not know the exact distribution of stock prices; we only know that it is close to a Brownian motion stochastic process. Typically, the logarithm of the stock price follows a Brownian motion with drift.
Stochastic differential equation (SDE) of the stock price $S(t)$: $$ dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t) $$ where:
In simple terms, $dS(t) = S(t+dt) - S(t)$ is the change in the stock price over the small interval $dt$, with $S(t)$ representing the stock price at time $t$.
The continuously compounded (logarithmic) return of the stock over the interval $dt$ is $r(t) = \log\left(\frac{S(t+dt)}{S(t)}\right)$.
The volatility $\sigma$ is the square root of the variance. It provides a measure of the risk or uncertainty of the stock price. $\sigma$ is a key parameter of the Geometric Brownian motion that determines the stochastic variation of the stock price.
The variance of the stock price over a given period is a measure of the magnitude of expected price fluctuations. It is the mathematical expectation of the squared deviation between the price and its mean.
$W(t)$ is a random variable that follows a Wiener process. It is the random component of the stock price movement and is related to the time increment $dt$. The mathematical expression of this relationship is: $dW(t)=\epsilon \sqrt{dt}$.
In this expression, the term $\epsilon$ represents a random variable that is normally distributed with an expected value of zero (mean zero) and a variance equal to 1. Its mathematical expectation is $E(dW(t))=\sqrt{dt}\,E(\epsilon)=0$, with variance $Var(dW(t))=dt\,Var(\epsilon)=dt$.
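The two moments of $dW(t)$ stated above can be checked numerically. The following is a minimal sketch, assuming a daily step of $dt = 1/252$ years and a sample size chosen only for illustration; neither value comes from the notebook.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the sketch is reproducible
dt = 1.0 / 252                        # assumed step: one trading day, in years
n_draws = 200_000                     # illustrative sample size

epsilon = rng.standard_normal(n_draws)  # epsilon ~ N(0, 1)
dW = epsilon * np.sqrt(dt)              # dW = epsilon * sqrt(dt)

sample_mean = dW.mean()
sample_var = dW.var(ddof=1)
print(f"mean of dW: {sample_mean:.6f} (theory: 0)")
print(f"var of dW : {sample_var:.6f} (theory: dt = {dt:.6f})")
```

The sample mean should sit near zero and the sample variance near $dt$, up to Monte Carlo noise.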
The increments of $W(t)$ are independent over time. If the company associated with the stock does not distribute a portion of its profits to shareholders as dividend payments, the stochastic equation for the return of the stock is: $\frac{dS(t)}{S(t)} = \mu \, dt + \sigma \, dW(t)$.
Therefore, it makes sense that the return on a stock does not depend on the price of the stock.
Let's now focus on the stochastic equation for the return. We will dig into its two parts: $\mu \, dt$ and $\sigma \, dW(t)$.
The first part, $\mu \, dt$, is deterministic, meaning that, using the historical data, we can calculate the expected change in the stock price over the small time interval $dt$, assuming no randomness. Essentially, $\mu$ is the expected rate of return per unit time, and when multiplied by the stock price $S(t)$ and the time interval $dt$, it gives the expected change in the stock price due to predictable factors like steady growth, interest rates, or dividends.
The second part, $\sigma \, dW(t)$, is the stochastic, or random, part of the stock's rate of return. It takes into consideration the unpredictable fluctuations in the stock price due to factors such as market volatility, company-specific news, economic factors, geopolitical events, natural disasters and pandemics, investor behavior and sentiment, technological advances and disruptions, and global economic interdependencies. The term $dW(t)$ represents the random shock to the stock price, and $\sigma$ scales this shock, making it more or less volatile. Let's look at the solution of the stochastic equation of the stock price.
Equation 1: $S(t) = S(0) \exp \left( \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t) \right)$
or, in more detail,
Equation 2: $S(t) = S(0) \exp \left( \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma \phi \sqrt{dt} \right)$
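To make the closed-form solution concrete, here is a minimal sketch that draws terminal prices from it and compares the sample mean with the theoretical value $S(0)e^{\mu t}$. The function name `simulate_gbm_terminal` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_gbm_terminal(s0, mu, sigma, t, n_paths, rng):
    """Draw S(t) = S(0) * exp((mu - sigma^2 / 2) * t + sigma * W(t)), with W(t) ~ N(0, t)."""
    w_t = rng.standard_normal(n_paths) * np.sqrt(t)
    return s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w_t)

rng = np.random.default_rng(42)
s0, mu, sigma, t = 100.0, 0.08, 0.20, 1.0   # hypothetical annualized parameters
prices = simulate_gbm_terminal(s0, mu, sigma, t, n_paths=200_000, rng=rng)
theoretical_mean = s0 * np.exp(mu * t)

print("min simulated price :", prices.min())
print("mean simulated price:", prices.mean())
print("theoretical mean    :", theoretical_mean)
```

Every simulated price is positive because of the exponential, and the $-\sigma^2/2$ correction is what keeps the average price growing at rate $\mu$.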
In Equation 2, the term $\phi$ represents correlated normal draws with a standard deviation of 1 and an expected value of zero (mean zero). From the Geometric Brownian Motion perspective, however, the random shocks to a stock price are independent over time (uncorrelated) and normally distributed with a mean of zero and a standard deviation that depends on the time interval $dt$, which makes the price itself log-normal. To resolve this, we will proceed as follows:
After that, we will simulate the portfolio Profit & Loss, and finally we will calculate the portfolio VaR (Value at Risk) and CVaR (Conditional Value at Risk).
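Mathematical Formula:
$$
\text{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})
$$
Where:
$X_i$ and $Y_i$ are the $i$-th observations of the two return series, $\bar{X}$ and $\bar{Y}$ are their sample means, and $n$ is the number of observations.
Mathematical Formula:
$$
\text{Correlation}(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}
$$
Where:
$\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$.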
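The covariance and correlation formulas can be written out directly and compared with NumPy's built-in estimators. This is a small sketch on made-up return values; the helper names `sample_covariance` and `sample_correlation` are illustrative and are not part of the notebook, which relies on `log_returns.cov()` and `log_returns.corr()`.

```python
import numpy as np

def sample_covariance(x, y):
    """Cov(X, Y) = sum((X_i - mean(X)) * (Y_i - mean(Y))) / (n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

def sample_correlation(x, y):
    """Correlation(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y), using sample standard deviations."""
    return sample_covariance(x, y) / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Made-up daily log returns for two assets (illustration only)
x = np.array([0.010, -0.020, 0.015, 0.005, -0.010])
y = np.array([0.008, -0.015, 0.020, 0.000, -0.012])

cov_xy = sample_covariance(x, y)
corr_xy = sample_correlation(x, y)
print("covariance :", cov_xy, "| numpy:", np.cov(x, y)[0, 1])
print("correlation:", corr_xy, "| numpy:", np.corrcoef(x, y)[0, 1])
```

Both hand-written values should agree with `np.cov` and `np.corrcoef`, since NumPy also uses the $n-1$ denominator by default.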
The following is the process used to predict the portfolio Value at Risk (VaR) and Conditional Value at Risk (CVaR) using Monte Carlo simulation:
1. Uncorrelated Normal Distribution Simulation
An uncorrelated normal distribution describes a situation where random variables follow a normal distribution and have no linear relationship, resulting in a correlation of zero.
Why will we use uncorrelated normal distributions? As noted in the Exploratory Data Analysis, real-world stock prices move independently and are non-stationary, meaning they generally increase over time, which makes the distribution time-dependent. Unlike normal distributions, which are stationary, stock prices often show more extreme values, or 'fat tails', indicating larger price changes than a normal distribution would predict. Thus, uncorrelated normal distributions are used as a baseline to simulate real-world stock price movements and to better understand the tail risks. How can the fat-tail risk be minimized?
2. Correlated Normal Distribution using Cholesky Decomposition
This problem can be addressed by applying the Cholesky decomposition to the asset covariance matrix and using the resulting factor to transform the uncorrelated normal draws into correlated draws for the asset log returns.
Normal Distribution: A symmetrical, bell-shaped distribution centered around the mean (μ), characterized by its mean and standard deviation (σ).
Uncorrelated Variables: Random variables with zero covariance, indicating no linear relationship. However, they are not necessarily independent unless they are jointly normally distributed.
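The Cholesky step can be sketched as follows. The 3-asset correlation matrix is hypothetical. The sketch factors a correlation matrix rather than a covariance matrix so that the resulting draws keep unit variance, matching the description of $\phi$ as having a standard deviation of 1; this is an assumption of the sketch, not a statement about how the notebook's own functions are called.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical correlation matrix for three assets (symmetric, positive definite)
corr = np.array([[1.0, 0.6, 0.2],
                 [0.6, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])

L = np.linalg.cholesky(corr)            # lower-triangular factor with L @ L.T == corr
Z = rng.standard_normal((100_000, 3))   # uncorrelated standard normal draws
phi = Z @ L.T                           # correlated draws, one row per simulation

sample_corr = np.corrcoef(phi, rowvar=False)
print(np.round(sample_corr, 3))
```

The sample correlation of `phi` should reproduce `corr` up to sampling noise. If the covariance matrix is factored instead, each column of the output is already scaled by that asset's volatility.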
3. Daily Returns Simulation
$\text{Daily returns} = \exp\left( \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma \phi \sqrt{dt} \right)$
In this formula, $\phi$ represents the correlated normal draws, scaled by $\sqrt{dt}$.
4. Future Stock Price Simulation
$S(t) = S(0) \exp \left( \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma \phi \sqrt{dt} \right)$, where $S(0)$ is the initial asset price.
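Steps 3 and 4 can be illustrated together for a single time step. In this minimal sketch the means, volatilities, and initial prices are hypothetical, and independent standard normal draws stand in for the correlated $\phi$ to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([0.0007, 0.0003])     # hypothetical daily mean log returns
sigma = np.array([0.018, 0.012])    # hypothetical daily volatilities
s0 = np.array([36.0, 58.0])         # hypothetical initial prices
dt = 1.0                            # one trading day
n_sims = 50_000

# Stand-in for the correlated normal draws phi (independent here for brevity)
phi = rng.standard_normal((n_sims, 2))

# Step 3: simulated gross daily returns, one row per simulation, one column per asset
growth = np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * phi)

# Step 4: simulated future prices
future_prices = s0 * growth

print(future_prices[:3])
```

NumPy broadcasting applies each asset's $\mu$, $\sigma$, and $S(0)$ to its own column, so no explicit loop over assets is needed.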
5. Portfolio Price Simulation
$P(t)$ is the sum of each asset's price multiplied by its weight. If there are $n$ assets, the portfolio price at time $t$ is calculated by adding up the weighted prices of all the assets.
$P(t) = \sum_{i=1}^{n} w_i S_i(t)$
Where:
$w_i$ is the weight of asset $i$ in the portfolio.
$S_i(t)$ is the simulated price of asset $i$ at time $t$.
6. Portfolio Profit & Loss Simulation
The portfolio Profit & Loss (P&L) is the difference between the simulated asset prices and their initial prices, multiplied by their respective weights and summed over all assets.
$\text{P\&L}(t) = \sum_{i=1}^{n} w_i \left( S_i(t) - S_i(0) \right)$ Where:
$\text{P\&L}(t)$ is the profit or loss of the portfolio at time $t$.
$w_i$ is the weight of asset $i$ in the portfolio.
$S_i(t)$ is the price of asset $i$ at time $t$.
$S_i(0)$ is the initial price of asset $i$ (at time 0).
$n$ is the number of assets in the portfolio.
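A small worked example may help with steps 5 and 6. It reads the P&L as the weighted sum of price changes, $\sum_i w_i (S_i(t) - S_i(0))$. The weights, prices, and helper names below are hypothetical.

```python
import numpy as np

def portfolio_value(prices, weights):
    """P(t) = sum_i w_i * S_i(t), computed for every simulated row of prices."""
    return prices @ weights

def portfolio_pnl(prices, initial_prices, weights):
    """P&L(t) = sum_i w_i * (S_i(t) - S_i(0)), computed for every simulated row."""
    return (prices - initial_prices) @ weights

weights = np.array([0.5, 0.3, 0.2])                 # hypothetical weights
initial_prices = np.array([36.0, 10.0, 58.0])       # hypothetical S_i(0)
simulated_prices = np.array([[37.0, 9.5, 59.0],     # scenario 1
                             [35.0, 10.5, 57.0]])   # scenario 2

values = portfolio_value(simulated_prices, weights)
pnl = portfolio_pnl(simulated_prices, initial_prices, weights)
print("portfolio values:", values)   # approximately [33.15, 32.05]
print("portfolio P&L   :", pnl)      # approximately [0.55, -0.55]
```

The P&L equals the simulated portfolio value minus the initial portfolio value, $\sum_i w_i S_i(0) = 32.6$ in this example.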
7. VaR and CVaR calculation
VaR represents the maximum loss that is not exceeded at a given confidence level.
$\text{VaR}_\alpha = \inf \{ x \in \mathbb{R} : F_L(x) \geq \alpha \}$
Where:
$\alpha$ is the confidence level (e.g., 0.95 or 0.99).
$F_L(x)$ is the cumulative distribution function of the portfolio loss $L$ (the negative of the P&L).
CVaR is the average of the losses that exceed VaR.
$\text{CVaR}_\alpha = \mathbb{E}[ L \mid L \geq \text{VaR}_\alpha ]$
Where:
$\mathbb{E}[ L \mid L \geq \text{VaR}_\alpha ]$ is the expected loss given that the loss $L$ is at least the $\text{VaR}$ at confidence level $\alpha$.
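As a rough illustration of step 7, the sketch below estimates VaR and CVaR from simulated P&L values using the loss convention (loss = negative P&L). The function name `var_cvar` and the standard normal P&L sample are assumptions made for the example; for that distribution the 95% values are known to be about 1.645 and 2.063.

```python
import numpy as np

def var_cvar(pnl, alpha=0.95):
    """Empirical VaR and CVaR of simulated P&L at confidence level alpha."""
    losses = -np.asarray(pnl, dtype=float)   # a loss is a negative P&L
    var = np.quantile(losses, alpha)         # loss level not exceeded with probability alpha
    cvar = losses[losses >= var].mean()      # average loss in the tail at or beyond VaR
    return var, cvar

rng = np.random.default_rng(3)
pnl = rng.normal(loc=0.0, scale=1.0, size=200_000)   # illustrative simulated P&L

var_95, cvar_95 = var_cvar(pnl, alpha=0.95)
print(f"VaR 95% : {var_95:.3f}")
print(f"CVaR 95%: {cvar_95:.3f}")
```

CVaR is always at least as large as VaR because it averages only the losses in the tail beyond it.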
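# Imports used by the functions below
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.stats import norm

def plotting_heatmap_for_correlation_matrix(log_returns, title):
    """Plot the correlation matrix of the log returns as an annotated heatmap."""
    plt.figure(figsize=(20, 8))
    sns.heatmap(log_returns.corr(), annot=True, cmap='coolwarm', vmin=-1, vmax=1, linewidths=0.5, fmt=".2f")
    plt.yticks(rotation=0)  # keep the tick labels horizontal
    plt.title(title, pad=20)

def plotting_heatmap_for_covariance_matrix(covariance_matrix, title):
    """Plot a covariance matrix as an annotated heatmap scaled to its own min and max."""
    plt.figure(figsize=(20, 8))
    sns.heatmap(covariance_matrix, annot=True, cmap='coolwarm', fmt=".5f",
                linewidths=0.5, vmin=covariance_matrix.min().min(), vmax=covariance_matrix.max().max())
    plt.yticks(rotation=0)  # keep the tick labels horizontal
    plt.title(title, pad=20)

def variance_covariance_matrix(log_returns):
    """Return the sample variance-covariance matrix of the log returns."""
    return log_returns.cov()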
#-----------------------------------------------------------------------------------------------------------------------------------------------
# Create the Cholesky matrix: apply the Cholesky decomposition to the covariance matrix
# Input: covariance matrix (DataFrame)
# Output: lower-triangular Cholesky factor as a DataFrame
#------------------------------------------------------------------------------------------------------------------------------------------------
def create_cholesky_matrix(covar_mat):
    cholesky_matrix_data = np.linalg.cholesky(covar_mat.to_numpy())
    return pd.DataFrame(cholesky_matrix_data, columns=covar_mat.columns.tolist(), index=covar_mat.columns.tolist())
#--------------------------------------------------------------------------------------------------------
# Simulate uncorrelated standard normal draws (e.g. 10000 iterations) used to calculate the stock prices.
# Input: covariance matrix (used only for the asset names) and number of iterations
# Output: one Z-score per iteration and asset, as an array and as a DataFrame
#--------------------------------------------------------------------------------------------------------
def simulate_uncorelated_normal_distribution(covar_mat, iterations):
    number_of_assets = len(covar_mat.columns.tolist())
    # Z-score array: inverse normal CDF applied to uniform(0, 1) draws
    Z = norm.ppf(np.random.rand(iterations, number_of_assets))
    Z_df = pd.DataFrame(data=Z, index=range(Z.shape[0]), columns=covar_mat.columns.tolist())
    return Z, Z_df
#-----------------------------------------------------------------------------------------------------
# Description: generate correlated normal draws from the transposed Cholesky matrix and the
# uncorrelated normal Z-scores
# Spreadsheet equivalent: =MMULT(unCorrelated_normal_distribution,TRANSPOSE(cholesky_matrix))
#------------------------------------------------------------------------------------------------------
def generate_correlated_normal_distribution(cholesky_matrix_data_df, Z):
    Correlated_Normals_Z = np.matmul(Z, cholesky_matrix_data_df.T)
    Correlated_Normals_Z_arr = np.array(Correlated_Normals_Z)
    return Correlated_Normals_Z, Correlated_Normals_Z_arr
#--------------------------------------------------------------------------------------------------------------
#Description: Daily returns simulation
# simulated gross return = exp((𝝁i − (1/2)𝝈i^2)(t2 − t1) + 𝝈i √(t2 − t1) 𝝓i)
#Inputs:
# 𝝓 : Correlated_Normals_Z
# 𝝓_arr : Correlated_Normals_Z_array (kept in the signature; not used in the calculation)
# 𝝁 : log_returns.mean()
# 𝝈 : log_returns.std()
# delta_t : time step in trading days (1 = one day)
#Output: data frame of simulated gross daily returns, one column per asset
#Note: 𝝓 is built from the Cholesky factor of the covariance matrix, so it already carries each asset's volatility
#--------------------------------------------------------------------------------------------------------------
def simulate_daily_returns(𝝓, 𝝓_arr, 𝝁, 𝝈, delta_t):
    daily_returns_list_df = np.exp((𝝁 - 0.5 * 𝝈 ** 2) * delta_t + 𝝈 * delta_t ** 0.5 * 𝝓)
    return daily_returns_list_df
#--------------------------------------------------------------------------------------------------------------------------
# Description: Stock price simulation
# Each asset's simulated gross return is multiplied by its initial price (assets matched by column order).
def stock_prices_simulation(initial_prices, daily_returns_list_df):
    future_stock_price_list_df = daily_returns_list_df.mul(initial_prices.values, axis='columns')
    return future_stock_price_list_df.reset_index(drop=True)
#Initial portfolio price
def calculate_initial_portfolio_price(𝓢1):
    return 𝓢1.values.sum()
#Portfolio price simulation: sum the simulated asset prices across each scenario (row)
def simulated_portfolio_price(future_stock_price_df):
    simulated_portfolio_price_df = future_stock_price_df.sum(axis=1).to_frame(name='portfolio_prices')
    return simulated_portfolio_price_df.reset_index(drop=True)
#-----------------------------------------------------------------------------------------------------------
#Description: Portfolio profit and loss calculation
# input : simulated_portfolio_price_df, portfolio_initial_price
# output: portfolio_profit_and_loss_df
#-----------------------------------------------------------------------------------------------------------
def calculate_prtfolio_profit_and_loss(simulated_portfolio_price_df, portfolio_initial_price):
    portfolio_profit_and_loss_df = simulated_portfolio_price_df - portfolio_initial_price
    portfolio_profit_and_loss_df.columns = ['profit_&_lost']
    return portfolio_profit_and_loss_df
def set_portfolio_price_profit_and_Loss_simulation_df(simulated_portfolio_price_df, portfolio_profit_and_loss_df):
    return pd.DataFrame({'simulated_portfolio_price': simulated_portfolio_price_df['portfolio_prices'].values,
                         'Simulated Portfolio Profit & Lost': portfolio_profit_and_loss_df['profit_&_lost'].values})
#-------------------------------------------------------------------------------------------------------------------------
# Description: sort profit and loss in ascending order; find the confidence-level rank; calculate VaR and CVaR
# Input : portfolio_profit_and_loss_df, confidence_level (e.g. 0.95)
# Output: VaR, CVaR
#-------------------------------------------------------------------------------------------------------------------------
def calculate_portfolio_Var_and_CVar(portfolio_profit_and_loss_df, confidence_level):
    # sort profit and loss in ascending order (worst outcomes first)
    lportfolio_profit_and_loss_df = portfolio_profit_and_loss_df.sort_values(by='profit_&_lost', ascending=True)
    lportfolio_profit_and_loss_df = lportfolio_profit_and_loss_df.reset_index(drop=True)
    # confidence level rank (e.g. the 5% worst outcomes at a 95% confidence level)
    rank = int((1 - confidence_level) * len(lportfolio_profit_and_loss_df)) - 1
    # VaR: profit and loss at the confidence level rank of the sorted distribution
    VaR = lportfolio_profit_and_loss_df.iloc[rank]['profit_&_lost']
    # CVaR: average profit and loss of the outcomes worse than VaR
    port_folio_lost_beyond_VaR = lportfolio_profit_and_loss_df['profit_&_lost'].iloc[:rank]
    CVaR = port_folio_lost_beyond_VaR.mean()
    return VaR, CVaR
#--------------------------------------------------------------------------------------------------------------------------
#Description: Profit and loss summary statistics: minimum, maximum, mean and standard deviation of the
# simulated profit and loss, plus Value at Risk (VaR) and Conditional Value at Risk (CVaR)
#Input : portfolio_profit_and_loss_df, VaR, CVaR
#Output: data frame of summary statistics indexed by statistic name
#--------------------------------------------------------------------------------------------------------------------------
def profit_and_loss_summary_statistics(portfolio_profit_and_loss_df, VaR, CVaR):
    VaR_and_CVaR_df = pd.DataFrame([{'VaR': VaR, 'CVaR': CVaR}]).transpose()
    VaR_and_CVaR_df = VaR_and_CVaR_df.rename(columns={0: 'profit_&_lost'})
    portfolio_profit_and_loss_stat_df = portfolio_profit_and_loss_df.agg(['min', 'max', 'mean', 'std'])
    return pd.concat([portfolio_profit_and_loss_stat_df, VaR_and_CVaR_df], ignore_index=False)
# Plot the profit and loss distribution (left) and the statistics by time horizon (right)
def profit_lost_summary(p_portfolio_profit_and_loss_df, p_portfolio_profit_and_loss_time_horizons_df):
    fig, ax = plt.subplots(1, 2, figsize=(20, 8))
    p_portfolio_profit_and_loss_df.plot.kde(ax=ax[0], legend=True, title='Distribution: Profit & Lost')
    p_portfolio_profit_and_loss_df.plot.hist(density=True, ax=ax[0])
    ax[0].set_ylabel('Probability')
    ax[0].grid(axis='y')
    ax[0].set_facecolor('#d8dcd6')
    bars = p_portfolio_profit_and_loss_time_horizons_df.plot(kind='bar', ax=ax[1], colormap='viridis', alpha=0.7, width=2, title='Profit & Lost Time Horizons')
    # Customize the plot
    ax[1].set_xlabel('Time Horizon')
    ax[1].set_ylabel('Value')
    ax[1].grid(axis='y')
    ax[1].spines['top'].set_visible(True)
    ax[1].spines['right'].set_visible(True)
    ax[1].spines['bottom'].set_visible(True)
    ax[1].spines['left'].set_visible(True)
    ax[1].axes.get_yaxis().set_visible(False)
    # Display bar values on top of each bar
    for bar in bars.patches:
        height = bar.get_height()
        ax[1].text(
            bar.get_x() + bar.get_width() / 2,  # x-coordinate
            height - 0.3,                       # y-coordinate
            f'{height}',                        # text
            ha='center',                        # horizontal alignment
            va='bottom'                         # vertical alignment
        )
    ax[1].legend(title='statistics', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.subplots_adjust(wspace=0.05)
    plt.show()
def portfolio_profit_and_loss_time_horizons_df(p_portfolio_daily_profit_and_loss_df):
    # Number of trading days in each time horizon
    time_horizons = {
        'Daily': 1,
        'Weekly': 5,
        'Biweekly': 10,
        'Monthly': 21,
        'Quarterly': 63,
        'Annual': 252
    }
    daily_values = p_portfolio_daily_profit_and_loss_df['profit_&_lost']
    # Square-root-of-time rule: scale each daily statistic by sqrt(number of trading days)
    scaled_columns = {horizon + 'Profit & Lost': (daily_values * np.sqrt(days)).round(1)
                      for horizon, days in time_horizons.items()}
    return pd.DataFrame(scaled_columns, index=p_portfolio_daily_profit_and_loss_df.index)
def summary_statistics_graph_and_table(portfolio_profit_and_loss_df, portfolio_profit_and_loss_time_horizons_df):
    profit_lost_summary(portfolio_profit_and_loss_df, portfolio_profit_and_loss_time_horizons_df)
    display(portfolio_profit_and_loss_time_horizons_df.T)
#-------------------------------------------------------Data Setting----------------------------------------------------------
covar_mat = variance_covariance_matrix(most_diversify_portfolio_assets_log_returns_df)
cholesky_matrix_data_df = create_cholesky_matrix(covar_mat)
#daily_returns_list_df = simulate_daily_returns(Correlated_Normals_Z,Correlated_Normals_Z_arr, log_returns.mean(),log_returns.std(),1)
Z, Z_df = simulate_uncorelated_normal_distribution(covar_mat,10000)
Correlated_Normals_Z, Correlated_Normals_Z_arr= generate_correlated_normal_distribution(cholesky_matrix_data_df,Z)
daily_returns_df = simulate_daily_returns(Correlated_Normals_Z,Correlated_Normals_Z_arr,
most_diversify_portfolio_assets_log_returns_df.mean(),
most_diversify_portfolio_assets_log_returns_df.std(),1)
future_stock_price_df = stock_prices_simulation(most_diversify_portfolio_assets_initial_prices,daily_returns_df)
simulated_portfolio_price_df= simulated_portfolio_price(future_stock_price_df)
initial_portfolio_prices = calculate_initial_portfolio_price(most_diversify_portfolio_assets_initial_prices)
portfolio_profit_and_loss_df = calculate_prtfolio_profit_and_loss(simulated_portfolio_price_df, initial_portfolio_prices)
simulated_portfolio_price_profit_and_Loss_df = set_portfolio_price_profit_and_Loss_simulation_df(simulated_portfolio_price_df,
portfolio_profit_and_loss_df)
VaR, CVaR = calculate_portfolio_Var_and_CVar(portfolio_profit_and_loss_df, 0.95)
profit_and_loss_summary_statistics_df = profit_and_loss_summary_statistics(portfolio_profit_and_loss_df,VaR,CVaR)
portfolio_profit_and_loss_time_horizons_df = portfolio_profit_and_loss_time_horizons_df(profit_and_loss_summary_statistics_df)
#----------------------------------------------------------Data Printing ----------------------------------------------------
plotting_heatmap_for_correlation_matrix(most_diversify_portfolio_assets_log_returns_df,
'Correlation Matrix of the Most Diversified portfolio Asset Log Returns')
plotting_heatmap_for_covariance_matrix(covar_mat, 'Covariance Matrix of the Most Diversified portfolio Asset Log Returns')
plotting_heatmap_for_covariance_matrix(cholesky_matrix_data_df, 'Cholesky Matrix of the Most Diversified portfolio Asset Log Returns')
print('\n Uncorrelated normal Z simulation\n')
display(Z_df)
print('\nCorrelated Normal Z distribution\n')
display(Correlated_Normals_Z)
print('\nDaily returns simulation\n')
display(daily_returns_df)
print('\nFuture stock price simulation\n')
display(future_stock_price_df)
print('initial_portfolio_prices')
display(initial_portfolio_prices)
print('\nSimulated Portfolio Prices - Profit & Lost')
display(simulated_portfolio_price_profit_and_Loss_df)
summary_statistics_graph_and_table(portfolio_profit_and_loss_df, portfolio_profit_and_loss_time_horizons_df.T)
Uncorrelated normal Z simulation
| | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -1.461680 | 0.731590 | 0.810047 | -0.082086 | -0.259016 | 0.189051 | -1.773228 | -1.266670 | 1.144792 | -1.618592 | 0.012877 |
| 1 | 0.483978 | 0.423525 | -1.770381 | 0.410302 | -0.028886 | -1.434581 | -1.187604 | -1.127606 | -0.407771 | -0.863672 | -0.848371 |
| 2 | 0.657793 | 0.753622 | 1.538440 | 0.515229 | -0.321961 | 0.188501 | 1.639347 | -0.237593 | -0.250557 | -0.349359 | 0.079098 |
| 3 | 0.617868 | -0.821943 | -0.218321 | 0.068942 | 0.690773 | 1.223608 | -0.539007 | -1.613701 | -0.804739 | 0.314620 | -0.901479 |
| 4 | -0.722329 | -0.386383 | -0.793887 | -0.235237 | -1.285817 | -0.405252 | 0.953291 | 3.262097 | 0.213284 | 2.239812 | 0.298679 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | -2.035752 | -0.793953 | -0.097404 | -0.392294 | 1.645817 | 1.182710 | -0.418856 | -0.823955 | -2.208061 | 0.032592 | 0.292743 |
| 9996 | -0.774168 | -1.673295 | -1.137698 | -1.072200 | 1.338937 | -1.007644 | 1.648649 | 1.007604 | -0.101299 | -2.052050 | -0.112984 |
| 9997 | 1.294729 | -1.421436 | -1.756090 | 0.806766 | -1.605433 | 0.943934 | 0.268175 | 0.009367 | 0.997463 | 0.141327 | -0.466869 |
| 9998 | -0.599911 | -1.245707 | 0.178666 | -1.053268 | -1.088356 | -0.291844 | 0.258986 | -1.381982 | 0.852582 | -1.609754 | -0.150586 |
| 9999 | 1.036581 | -1.073592 | 0.306125 | -1.003137 | 1.047839 | -0.112428 | -0.194010 | 0.632098 | 0.484087 | 0.362072 | -1.113140 |
10000 rows × 11 columns
Correlated Normal Z distribution
| | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.025602 | 0.002590 | -0.003664 | -0.003429 | -0.015230 | -0.001756 | -0.019467 | -0.020175 | -0.002820 | -0.070456 | -0.002241 |
| 1 | 0.008477 | 0.017874 | -0.006296 | -0.005135 | -0.001256 | -0.015184 | -0.014876 | -0.018673 | -0.017608 | -0.036767 | -0.005106 |
| 2 | 0.011522 | 0.029301 | 0.020064 | 0.020910 | 0.021292 | 0.020065 | 0.036729 | 0.024130 | 0.021542 | 0.005931 | 0.002314 |
| 3 | 0.010822 | -0.015519 | 0.000054 | 0.000101 | 0.011766 | 0.011117 | -0.005548 | -0.013428 | -0.010368 | 0.019298 | -0.004143 |
| 4 | -0.012652 | -0.019763 | -0.013683 | -0.013852 | -0.036834 | -0.020416 | -0.008089 | 0.010910 | -0.002854 | 0.071850 | -0.000295 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | -0.035657 | -0.047399 | -0.022118 | -0.022174 | -0.014352 | -0.007107 | -0.024033 | -0.028717 | -0.037791 | -0.019578 | 0.001180 |
| 9996 | -0.013560 | -0.056609 | -0.022505 | -0.024839 | -0.013373 | -0.028983 | -0.010824 | -0.014633 | -0.019534 | -0.112615 | -0.003070 |
| 9997 | 0.022678 | -0.024054 | -0.007648 | -0.005492 | -0.023331 | -0.004314 | -0.011070 | -0.013816 | -0.004336 | 0.005176 | -0.002689 |
| 9998 | -0.010508 | -0.042433 | -0.009732 | -0.012344 | -0.032846 | -0.018938 | -0.017168 | -0.034929 | -0.017782 | -0.074595 | -0.003511 |
| 9999 | 0.018156 | -0.017445 | 0.006240 | 0.002964 | 0.023460 | 0.004008 | 0.000067 | 0.012019 | 0.011458 | 0.017084 | -0.007012 |
10000 rows × 11 columns
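The correlated draws above are obtained by multiplying uncorrelated normal draws by the transposed Cholesky factor of the covariance matrix. The sketch below illustrates why this works on a small example: the sample covariance of the transformed draws recovers the input matrix. The 3-asset covariance matrix, the seed and the variable names are assumptions chosen for illustration, not values from this project.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, for reproducibility only

# Hypothetical 3-asset covariance matrix of daily log returns (illustrative values)
cov = np.array([[0.0004, 0.0002, 0.0001],
                [0.0002, 0.0009, 0.0003],
                [0.0001, 0.0003, 0.0005]])

L = np.linalg.cholesky(cov)             # lower-triangular factor, cov = L @ L.T
Z = rng.standard_normal((100_000, 3))   # uncorrelated standard normal draws
correlated = Z @ L.T                    # rows now have covariance close to cov

sample_cov = np.cov(correlated, rowvar=False)
print(np.round(sample_cov, 6))
```

Because the factor comes from the covariance matrix (not the correlation matrix), the transformed draws are already scaled by each asset's volatility.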
Daily returns simulation
| | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000145 | 1.000533 | 1.000139 | 1.000030 | 0.999868 | 1.000172 | 0.999878 | 0.999750 | 1.000055 | 0.996766 | 0.999964 |
| 1 | 1.000742 | 1.001003 | 1.000107 | 1.000010 | 1.000184 | 0.999974 | 0.999958 | 0.999779 | 0.999806 | 0.998217 | 0.999944 |
| 2 | 1.000795 | 1.001354 | 1.000427 | 1.000322 | 1.000694 | 1.000493 | 1.000864 | 1.000579 | 1.000464 | 1.000060 | 0.999995 |
| 3 | 1.000783 | 0.999977 | 1.000184 | 1.000072 | 1.000479 | 1.000361 | 1.000122 | 0.999877 | 0.999928 | 1.000637 | 0.999951 |
| 4 | 1.000372 | 0.999847 | 1.000017 | 0.999905 | 0.999380 | 0.999897 | 1.000077 | 1.000332 | 1.000054 | 1.002911 | 0.999977 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 0.999969 | 0.998998 | 0.999915 | 0.999805 | 0.999888 | 1.000093 | 0.999798 | 0.999591 | 0.999467 | 0.998959 | 0.999987 |
| 9996 | 1.000356 | 0.998716 | 0.999910 | 0.999773 | 0.999910 | 0.999772 | 1.000029 | 0.999854 | 0.999774 | 0.994952 | 0.999958 |
| 9997 | 1.000991 | 0.999715 | 1.000091 | 1.000005 | 0.999685 | 1.000134 | 1.000025 | 0.999869 | 1.000029 | 1.000027 | 0.999961 |
| 9998 | 1.000409 | 0.999151 | 1.000065 | 0.999923 | 0.999470 | 0.999919 | 0.999918 | 0.999475 | 0.999803 | 0.996588 | 0.999955 |
| 9999 | 1.000912 | 0.999918 | 1.000259 | 1.000107 | 1.000743 | 1.000256 | 1.000220 | 1.000353 | 1.000294 | 1.000542 | 0.999931 |
10000 rows × 11 columns
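The simulated gross returns follow the one-step geometric Brownian motion update quoted in the code comments. Written out with the same symbols (the price notation $S_i$ is an assumption added here for readability):

```latex
\frac{S_i(t_2)}{S_i(t_1)}
  = \exp\!\left[\left(\mu_i - \tfrac{1}{2}\sigma_i^{2}\right)(t_2 - t_1)
  + \sigma_i \sqrt{t_2 - t_1}\;\phi_i\right]
```

Here $\mu_i$ and $\sigma_i$ are the mean and standard deviation of asset $i$'s daily log returns, $\phi_i$ is the correlated normal draw for that asset, and $t_2 - t_1 = 1$ trading day in the run above.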
Future stock price simulation
| | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 36.217730 | 10.016877 | 38.591307 | 36.011345 | 27.436813 | 14.715933 | 25.497549 | 58.500936 | 45.860863 | 1.226022 | 22.428426 |
| 1 | 36.239356 | 10.021580 | 38.590074 | 36.010607 | 27.445479 | 14.713029 | 25.499603 | 58.502580 | 45.849476 | 1.227807 | 22.427984 |
| 2 | 36.241288 | 10.025098 | 38.602416 | 36.021871 | 27.459467 | 14.720653 | 25.522693 | 58.549436 | 45.879630 | 1.230074 | 22.429129 |
| 3 | 36.240844 | 10.011308 | 38.593047 | 36.012871 | 27.453556 | 14.718717 | 25.503775 | 58.508320 | 45.855051 | 1.230784 | 22.428132 |
| 4 | 36.225946 | 10.010003 | 38.586617 | 36.006838 | 27.423421 | 14.711897 | 25.502638 | 58.534960 | 45.860838 | 1.233581 | 22.428726 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 36.211352 | 10.001511 | 38.582669 | 36.003240 | 27.437357 | 14.714775 | 25.495508 | 58.491590 | 45.833938 | 1.228719 | 22.428954 |
| 9996 | 36.225370 | 9.998683 | 38.582487 | 36.002087 | 27.437964 | 14.710045 | 25.501415 | 58.507001 | 45.847993 | 1.223791 | 22.428298 |
| 9997 | 36.248371 | 10.008684 | 38.589442 | 36.010453 | 27.431790 | 14.715379 | 25.501305 | 58.507895 | 45.859696 | 1.230034 | 22.428357 |
| 9998 | 36.227307 | 10.003037 | 38.588466 | 36.007490 | 27.425892 | 14.712217 | 25.498577 | 58.484794 | 45.849342 | 1.225803 | 22.428230 |
| 9999 | 36.245500 | 10.010716 | 38.595943 | 36.014109 | 27.460812 | 14.717180 | 25.506287 | 58.536174 | 45.871861 | 1.230666 | 22.427689 |
10000 rows × 11 columns
initial_portfolio_prices
316.50785970687866
Simulated Portfolio Prices - Profit & Lost
| | simulated_portfolio_price | Simulated Portfolio Profit & Lost |
|---|---|---|
| 0 | 316.503801 | -0.004059 |
| 1 | 316.527573 | 0.019713 |
| 2 | 316.681756 | 0.173896 |
| 3 | 316.556406 | 0.048546 |
| 4 | 316.525465 | 0.017606 |
| ... | ... | ... |
| 9995 | 316.429613 | -0.078246 |
| 9996 | 316.465135 | -0.042725 |
| 9997 | 316.531405 | 0.023545 |
| 9998 | 316.451154 | -0.056706 |
| 9999 | 316.616938 | 0.109079 |
10000 rows × 2 columns
| | DailyProfit & Lost | WeeklyProfit & Lost | BiweeklyProfit & Lost | MonthlyProfit & Lost | QuarterlyProfit & Lost | AnnualProfit & Lost |
|---|---|---|---|---|---|---|
| min | -0.2 | -0.5 | -0.7 | -1.1 | -1.9 | -3.7 |
| max | 0.5 | 1.0 | 1.4 | 2.1 | 3.6 | 7.2 |
| mean | 0.1 | 0.1 | 0.2 | 0.3 | 0.5 | 1.0 |
| std | 0.1 | 0.2 | 0.2 | 0.4 | 0.6 | 1.2 |
| VaR | 0.0 | 0.1 | 0.1 | 0.2 | 0.4 | 0.7 |
| CVaR | 0.1 | 0.1 | 0.2 | 0.3 | 0.5 | 1.0 |
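VaR and CVaR are read off the sorted profit and loss distribution. The sketch below shows that calculation on a small, hand-checkable array, assuming the same rank convention as `calculate_portfolio_Var_and_CVar` (rank = number of tail observations minus one, CVaR averaged over the outcomes strictly worse than VaR). The function name `var_cvar` and the sample data are hypothetical.

```python
import numpy as np

def var_cvar(pnl, confidence_level=0.95):
    """Historical-simulation VaR and CVaR from a sample of profit and loss values."""
    sorted_pnl = np.sort(np.asarray(pnl, dtype=float))      # worst outcomes first
    rank = int((1 - confidence_level) * len(sorted_pnl)) - 1
    if rank < 1:
        raise ValueError("sample too small for this confidence level")
    var = sorted_pnl[rank]               # P&L at the confidence-level rank
    cvar = sorted_pnl[:rank].mean()      # average of the outcomes worse than VaR
    return var, cvar

# Illustrative sample: 100 outcomes from -50 to 49
sample_pnl = np.arange(-50, 50)
var_95, cvar_95 = var_cvar(sample_pnl, 0.95)
print(var_95, cvar_95)
```

With this convention CVaR is always at or below VaR, since it averages only the part of the distribution beyond the VaR observation.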
Portfolio optimization involves determining the most effective asset allocation to achieve specific investment objectives, typically maximizing return for a given level of risk. In this section, we calculate the portfolio expected return and volatility. We then use the Monte Carlo method to simulate 10,000 trials: we generate 10,000 random portfolios with different asset allocations and plot them on a risk-return graph to create the random efficient frontier. Our aim is to select the optimal portfolios, that is, the portfolios with the highest expected return for the lowest risk. To achieve this, we use machine learning techniques to approximate a degree-2 polynomial function that fits the upper bound of the random efficient frontier plot. Finally, we combine the K-means clustering technique with efficient frontier modeling to dig into the randomly generated portfolios and predict different types of investment risk tolerance (very conservative, conservative, moderate, aggressive and very aggressive).
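As a compact illustration of the Monte Carlo step, the sketch below draws random weights that sum to one and computes each portfolio's expected return, volatility and return-to-volatility ratio in a vectorized way. The three-asset mean vector, covariance matrix and all variable names are assumptions made for the example; no risk-free rate is subtracted in the ratio.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only

# Hypothetical daily mean log returns and covariance matrix for 3 assets
mean_returns = np.array([0.0005, 0.0003, 0.0008])
cov = np.array([[0.0004, 0.0002, 0.0001],
                [0.0002, 0.0009, 0.0003],
                [0.0001, 0.0003, 0.0005]])

n_portfolios = 10_000
raw = rng.random((n_portfolios, mean_returns.size))
weights = raw / raw.sum(axis=1, keepdims=True)   # each row sums to 1

expected_returns = weights @ mean_returns        # E[r_p] = w . mu
# sigma_p = sqrt(w Sigma w^T), evaluated row by row
volatilities = np.sqrt(np.einsum('ij,jk,ik->i', weights, cov, weights))
return_to_risk = expected_returns / volatilities

best = np.argmax(return_to_risk)
print(weights[best], expected_returns[best], volatilities[best])
```

Plotting `volatilities` against `expected_returns` gives the scatter cloud whose upper edge approximates the efficient frontier.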
def portfolio_random_weight_array_df(assets_returns_df):
    # random portfolio weight simulation: uniform draws normalised to sum to 1
    number_of_assets = len(assets_returns_df.columns.tolist())
    random_array = np.random.rand(1, number_of_assets)
    random_array_df = pd.DataFrame(random_array, columns=assets_returns_df.columns.tolist())
    random_weight_df = random_array_df / random_array_df.values.sum()
    return random_weight_df
def portfolio_expected_Return(random_weight_df, log_returns):
    # weighted average of the assets' mean log returns, expressed in percent
    assets_expected_returns = log_returns.mean()
    weighted_expected_returns = assets_expected_returns * random_weight_df
    portfolio_expected_return_ = weighted_expected_returns.values.sum()
    return 100 * portfolio_expected_return_
def portfolio_volatility(varcovar, w):
    # portfolio standard deviation sqrt(w Σ wᵀ), expressed in percent
    σp = np.sqrt(np.matmul(np.matmul(w, varcovar), w.T))
    return 100 * σp[0][0]
def efficient_frontiere_plot(portfolio_trails_simulation_df):
    display(portfolio_trails_simulation_df)
    portfolio_trails_simulation_df.plot(x='σp', y='E_rp', kind='scatter', figsize=(10, 6))
    plt.xlabel('Expected Volatility')
    plt.ylabel('Expected Return')
    plt.title('Random portfolios Efficient Frontier')
def generate_excess_return(log_returns_df):
    # deviation of each asset's log return from its own mean (column-wise demeaning)
    X_df = log_returns_df - log_returns_df.mean()
    return X_df
#Portfolio Statistics
def portfolio_arihtmetics(log_returns_df, index_adj_close_price_df):
    return pd.DataFrame({'mu expected_return': log_returns_df.mean(),
                         'variance': log_returns_df.var(),
                         'Sigmas(volatilities)': log_returns_df.std(),
                         'modifiy shape(Er)/𝝈': log_returns_df.mean() / log_returns_df.std(),
                         'initial price': index_adj_close_price_df.iloc[0]}).transpose()
def excess_return_varcovar(X_df):
    return X_df.cov()
def get_uncorrelated_assets_index_adj_close_price_df(index_adj_close_price_df, uncorrelated_assets_list):
    return index_adj_close_price_df[uncorrelated_assets_list]
def uncorelated_portfolio_trails_simulation(log_returns, most_diversify_portfolio_assets_list, trial):
    σp_list = []
    E_rp_list = []
    random_weight_array_df_rows_list = []
    selected_log_returns = log_returns[most_diversify_portfolio_assets_list]
    # the covariance matrix does not depend on the weights, so it is computed once
    varcovar = excess_return_varcovar(generate_excess_return(selected_log_returns))
    for i in range(trial):
        random_weight_array_df = portfolio_random_weight_array_df(selected_log_returns)
        random_weight_array_df_rows_list.append(random_weight_array_df)
        E_rp_list.append(portfolio_expected_Return(random_weight_array_df, selected_log_returns))
        σp_list.append(portfolio_volatility(varcovar, random_weight_array_df))
    uncorelated_portfolio_trails_simulation_df = pd.DataFrame({'σp': σp_list, 'E_rp': E_rp_list}, index=range(trial))
    σp = uncorelated_portfolio_trails_simulation_df['σp']
    E_rp = uncorelated_portfolio_trails_simulation_df['E_rp']
    # ratio of expected return to volatility (no risk-free rate is subtracted)
    sharpes_rat = E_rp / σp
    uncorelated_portfolio_trails_simulation_sharpes_ratio_df = pd.DataFrame({'σp': σp, 'E_rp': E_rp, 'sharpes_ratio': sharpes_rat})
    random_weight_array_all_rows_df = pd.concat(random_weight_array_df_rows_list, axis=0, ignore_index=True)
    uncorrelated_weighted_portfolio_trails_simulation_df = uncorelated_portfolio_trails_simulation_sharpes_ratio_df.merge(
        random_weight_array_all_rows_df, left_index=True, right_index=True)
    return (uncorelated_portfolio_trails_simulation_df, uncorelated_portfolio_trails_simulation_sharpes_ratio_df,
            random_weight_array_all_rows_df, uncorrelated_weighted_portfolio_trails_simulation_df)
uncorelated_portfolio_trails_simulation_df,uncorelated_portfolio_trails_simulation_sharpes_ratio_df, random_weight_array_all_rows_df, \
uncorrelated_weighted_portfolio_trails_simulation_df = \
uncorelated_portfolio_trails_simulation(log_returns, most_diversify_portfolio_assets_list, 10000)
X_df =generate_excess_return(most_diversify_portfolio_assets_log_returns_df)
Excess_return_varcovar = excess_return_varcovar(X_df)
display(Excess_return_varcovar)
efficient_frontiere_plot(uncorrelated_weighted_portfolio_trails_simulation_df)
| | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IGM | 0.000307 | 0.000216 | 0.000153 | 0.000145 | 0.000263 | 0.000138 | 0.000150 | 0.000183 | 0.000158 | 0.000167 | 0.000019 |
| CNQ | 0.000216 | 0.000943 | 0.000237 | 0.000228 | 0.000382 | 0.000270 | 0.000387 | 0.000384 | 0.000332 | 0.000240 | 0.000018 |
| DOL | 0.000153 | 0.000237 | 0.000147 | 0.000141 | 0.000209 | 0.000138 | 0.000155 | 0.000177 | 0.000159 | 0.000143 | 0.000012 |
| DOO | 0.000145 | 0.000228 | 0.000141 | 0.000144 | 0.000204 | 0.000137 | 0.000149 | 0.000172 | 0.000154 | 0.000142 | 0.000014 |
| BN | 0.000263 | 0.000382 | 0.000209 | 0.000204 | 0.000511 | 0.000244 | 0.000263 | 0.000319 | 0.000278 | 0.000197 | 0.000027 |
| PEY | 0.000138 | 0.000270 | 0.000138 | 0.000137 | 0.000244 | 0.000216 | 0.000180 | 0.000207 | 0.000191 | 0.000141 | 0.000014 |
| ENB | 0.000150 | 0.000387 | 0.000155 | 0.000149 | 0.000263 | 0.000180 | 0.000308 | 0.000247 | 0.000219 | 0.000152 | 0.000015 |
| BMO | 0.000183 | 0.000384 | 0.000177 | 0.000172 | 0.000319 | 0.000207 | 0.000247 | 0.000350 | 0.000270 | 0.000149 | 0.000014 |
| TD | 0.000158 | 0.000332 | 0.000159 | 0.000154 | 0.000278 | 0.000191 | 0.000219 | 0.000270 | 0.000282 | 0.000139 | 0.000011 |
| NGD | 0.000167 | 0.000240 | 0.000143 | 0.000142 | 0.000197 | 0.000141 | 0.000152 | 0.000149 | 0.000139 | 0.001866 | 0.000029 |
| TVE | 0.000019 | 0.000018 | 0.000012 | 0.000014 | 0.000027 | 0.000014 | 0.000015 | 0.000014 | 0.000011 | 0.000029 | 0.000047 |
| | σp | E_rp | sharpes_ratio | IGM | CNQ | DOL | DOO | BN | PEY | ENB | BMO | TD | NGD | TVE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.591041 | 0.047869 | 0.030087 | 0.103417 | 0.154074 | 0.045688 | 0.100111 | 0.069420 | 0.068036 | 0.033478 | 0.130465 | 0.129491 | 0.130359 | 0.035460 |
| 1 | 1.461517 | 0.047157 | 0.032266 | 0.284562 | 0.020576 | 0.011801 | 0.095588 | 0.057754 | 0.032894 | 0.117483 | 0.157138 | 0.105801 | 0.093760 | 0.022643 |
| 2 | 1.867592 | 0.054864 | 0.029377 | 0.133012 | 0.128207 | 0.021695 | 0.044347 | 0.025555 | 0.094248 | 0.004054 | 0.057632 | 0.174658 | 0.294028 | 0.022565 |
| 3 | 1.423179 | 0.042317 | 0.029734 | 0.152672 | 0.025634 | 0.025288 | 0.045849 | 0.041366 | 0.076287 | 0.155829 | 0.118562 | 0.107588 | 0.148332 | 0.102592 |
| 4 | 1.282241 | 0.038346 | 0.029906 | 0.141923 | 0.040289 | 0.110578 | 0.118217 | 0.135731 | 0.027192 | 0.140439 | 0.012431 | 0.043684 | 0.075764 | 0.153753 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 1.279847 | 0.037943 | 0.029647 | 0.183991 | 0.012637 | 0.167984 | 0.118589 | 0.066875 | 0.106072 | 0.015518 | 0.017766 | 0.134697 | 0.084626 | 0.091246 |
| 9996 | 1.345417 | 0.038060 | 0.028289 | 0.153573 | 0.003607 | 0.083775 | 0.096908 | 0.012720 | 0.101108 | 0.018303 | 0.113543 | 0.076000 | 0.176751 | 0.163713 |
| 9997 | 1.636396 | 0.045658 | 0.027902 | 0.008815 | 0.129299 | 0.082767 | 0.010676 | 0.147832 | 0.106178 | 0.121826 | 0.081679 | 0.133596 | 0.140138 | 0.037195 |
| 9998 | 1.413588 | 0.032336 | 0.022875 | 0.017752 | 0.055286 | 0.053208 | 0.223711 | 0.151653 | 0.092482 | 0.189462 | 0.026228 | 0.133031 | 0.003929 | 0.053257 |
| 9999 | 1.380792 | 0.041466 | 0.030030 | 0.118761 | 0.050469 | 0.127756 | 0.097175 | 0.086228 | 0.069522 | 0.121345 | 0.000650 | 0.005177 | 0.168017 | 0.154899 |
10000 rows × 14 columns
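The random portfolios above form a cloud whose upper edge is the quantity of interest. One simple way to approximate that edge with a degree-2 polynomial is sketched below on synthetic data: split the volatility axis into bins, keep the highest-return point in each bin, and fit `np.polyfit` to those points. The binning approach, the synthetic cloud and every name in the block are assumptions made for illustration, and are not necessarily the point selection used in this project.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, for reproducibility only

def true_upper_bound(x):
    # Hypothetical concave frontier used only to generate the synthetic cloud
    return 0.05 - (x - 1.8) ** 2

# Synthetic (volatility, return) cloud lying on or below the hypothetical frontier
sigma = rng.uniform(1.0, 2.0, size=5000)
ret = true_upper_bound(sigma) - rng.exponential(scale=0.01, size=sigma.size)

def fit_upper_bound(sigma, ret, n_bins=20, degree=2):
    """Fit a polynomial to the highest-return point of each volatility bin."""
    edges = np.linspace(sigma.min(), sigma.max(), n_bins + 1)
    # Bin index of every point; the maximum volatility is clipped into the last bin
    bin_ids = np.clip(np.digitize(sigma, edges) - 1, 0, n_bins - 1)
    xs, ys = [], []
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            best = np.argmax(np.where(in_bin, ret, -np.inf))
            xs.append(sigma[best])
            ys.append(ret[best])
    return np.polyfit(xs, ys, deg=degree)  # coefficients, highest power first

coeffs = fit_upper_bound(sigma, ret)
fitted_mid = np.polyval(coeffs, 1.5)
print(coeffs, fitted_mid)
```

A negative leading coefficient indicates a concave fitted curve, which is the shape expected for the upper part of a risk-return frontier.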
#--------------------------------------------------Efficient Frontiere Optimal Points-----------------------------------------------
# Sort the simulated portfolios by Sharpe ratio and read off the top-ranked portfolio
# Select the optimal portfolios according to selected_col ('sharpes_ratio' or 'E_rp')
# Sort the selected portfolios by volatility (ascending=True)
# Return the data frame
#---------------------------------------------------------------------------------------------------------------------------------
def efficient_frontiere_selected_sharpe_ratio_portfolio_df(uncorrelated_weighted_portfolio_trails_simulation_df, selected_col):
    sorted_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='sharpes_ratio', ascending=False)
    sorted_df = sorted_df.reset_index(drop=True)
    top1_sharpe_ratio_value = sorted_df['sharpes_ratio'].values[0]
    # select the optimal portfolios
    if selected_col == 'sharpes_ratio':
        optimal_portfolios_df = sorted_df[sorted_df[selected_col] >= top1_sharpe_ratio_value]
    elif selected_col == 'E_rp':
        optimal_portfolios_df = sorted_df
    else:
        raise ValueError("selected_col must be 'sharpes_ratio' or 'E_rp'")
    # sort the optimal portfolios by volatility
    optimal_portfolios_df = optimal_portfolios_df.sort_values(by='σp', ascending=True)
    return optimal_portfolios_df.reset_index(drop=True)
#----------------------------------------------------------------------------------------
def efficient_frontiere_optimal_sharpe_ratio_portfolios_model_points(uncorrelated_weighted_portfolio_trails_simulation_df, number_of_top_points=35):
    # sort by Sharpe ratio (descending) and keep the top portfolios
    top_df = (uncorrelated_weighted_portfolio_trails_simulation_df
              .sort_values(by='sharpes_ratio', ascending=False)
              .head(number_of_top_points))
    xpoints = top_df['σp'].to_numpy()
    ypoints = top_df['E_rp'].to_numpy()
    top_sharpe_ratio_value_points = top_df['sharpes_ratio'].to_numpy()
    return xpoints, ypoints, top_sharpe_ratio_value_points
def get_maximun_return_portfolio(uncorrelated_weighted_portfolio_trails_simulation_df):
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='E_rp',
ascending=False)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df.reset_index(drop=True)
max_E_rp_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['sharpes_ratio'].values[0]
max_E_rp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['E_rp'].values[0]
max_E_rp_σp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['σp'].values[0]
return max_E_rp_sharpe_ratio, max_E_rp, max_E_rp_σp
def get_maximun_risk_portfolio(uncorrelated_weighted_portfolio_trails_simulation_df):
# here the portfolios are sotrted from maximum risk
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df = \
uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='σp', ascending=False)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df = \
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df.reset_index(drop=True)
max_σp_E_rp_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df['sharpes_ratio'].values[0]
max_σp_E_rp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df['E_rp'].values[0]
max_σp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df['σp'].values[0]
return max_σp_E_rp_sharpe_ratio, max_σp_E_rp, max_σp
def get_minimum_risk_portfolio(uncorrelated_weighted_portfolio_trails_simulation_df):
    # here the portfolios are sorted from minimum risk
    portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df = \
        uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='σp', ascending=True)
    portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df.reset_index(drop=True)
    minimun_σp_E_rp_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['sharpes_ratio'].values[0]
    minimun_σp_E_rp = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['E_rp'].values[0]
    minimun_σp = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['σp'].values[0]
    return minimun_σp_E_rp_sharpe_ratio, minimun_σp_E_rp, minimun_σp
def get_maximum_sharpe_ratio(uncorrelated_weighted_portfolio_trails_simulation_df):
    # sort from maximum sharpe ratio and read the top sharpe ratio portfolio
    portfolio_trails_simulation_sharpes_ratio_top_df = \
        uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='sharpes_ratio', ascending=False)
    portfolio_trails_simulation_sharpes_ratio_top_df = portfolio_trails_simulation_sharpes_ratio_top_df.reset_index(drop=True)
    maximum_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_top_df['sharpes_ratio'].values[0]
    maximum_sharpe_ratio_σp_E_rp = portfolio_trails_simulation_sharpes_ratio_top_df['E_rp'].values[0]
    maximum_sharpe_ratio_σp = portfolio_trails_simulation_sharpes_ratio_top_df['σp'].values[0]
    return maximum_sharpe_ratio, maximum_sharpe_ratio_σp_E_rp, maximum_sharpe_ratio_σp
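The four getters above each sort the whole simulation table and read the first row. As a minimal sketch of the same selection, assuming a table with the same `σp`, `E_rp` and `sharpes_ratio` columns, `idxmax` and `idxmin` pick the same rows without a full sort (ties may resolve differently). The numbers and the helper name `pick_extreme_portfolio` are made up for illustration, and the Sharpe ratio is taken as `E_rp/σp`, mirroring the ratio used elsewhere in this notebook.

```python
import pandas as pd

def pick_extreme_portfolio(df, column, largest=True):
    """Return (sharpes_ratio, E_rp, σp) of the row with the largest or smallest `column`."""
    idx = df[column].idxmax() if largest else df[column].idxmin()
    row = df.loc[idx]
    return row['sharpes_ratio'], row['E_rp'], row['σp']

# Hypothetical simulation results: three random portfolios.
toy_df = pd.DataFrame({
    'σp':   [0.10, 0.20, 0.15],
    'E_rp': [0.05, 0.08, 0.09],
})
toy_df['sharpes_ratio'] = toy_df['E_rp'] / toy_df['σp']

max_sharpe = pick_extreme_portfolio(toy_df, 'sharpes_ratio')     # third portfolio
min_risk = pick_extreme_portfolio(toy_df, 'σp', largest=False)   # first portfolio
max_return = pick_extreme_portfolio(toy_df, 'E_rp')              # third portfolio
```

Each call returns the values in the same order as the getters above: Sharpe ratio, expected return, risk.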
#-----------------------------------------------------------------------------------------------------------------------------------
def efficient_frontiere_optimal_portfolios_model_points(uncorrelated_weighted_portfolio_trails_simulation_df,number_of_top_points = 35):
#number_of_top_points = 35
#sort from maximum sharpe ratio and get top sharpe ratio portfolios
portfolio_trails_simulation_sharpes_ratio_top_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='sharpes_ratio', ascending=False)
portfolio_trails_simulation_sharpes_ratio_top_df = portfolio_trails_simulation_sharpes_ratio_top_df.reset_index(drop=True)
uncorelated_portfolio_trails_simulation_sharpes_ratio_top_df =portfolio_trails_simulation_sharpes_ratio_top_df.head(number_of_top_points)
# minimum risk portfolio: here the portfolios are sorted from minimum risk
portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='σp', ascending=True)
portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df.reset_index(drop=True)
portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df.head(number_of_top_points)
minimun_σp_E_rp_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['sharpes_ratio'].values[0]
minimun_σp_E_rp = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['E_rp'].values[0]
minimun_σp = portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['σp'].values[0]
# maximum return portfolio
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='E_rp', ascending=False)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df.reset_index(drop=True)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df.head(number_of_top_points)
max_E_rp_sharpe_ratio = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['sharpes_ratio'].values[0]
max_E_rp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['E_rp'].values[0]
max_E_rp_σp = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['σp'].values[0]
# maximum risk portfolio: here the portfolios are sorted from maximum risk
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(by='σp', ascending=False)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df.reset_index(drop=True)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_selecte_df.head(number_of_top_points)
xpoints_list = []
ypoints_list = []
top_sharpe_ratio_value_points_list = []
for portfolio_number in range(number_of_top_points):
#top sharpe ratio
top_sharpe_ratio_value_points_list.append(portfolio_trails_simulation_sharpes_ratio_top_df['sharpes_ratio'].values[portfolio_number])
xpoints_list.append(portfolio_trails_simulation_sharpes_ratio_top_df['σp'].values[portfolio_number])
ypoints_list.append(portfolio_trails_simulation_sharpes_ratio_top_df['E_rp'].values[portfolio_number])
# minimum risk portfolio:
top_sharpe_ratio_value_points_list.append(portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['sharpes_ratio'].values[portfolio_number])
xpoints_list.append(portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['σp'].values[portfolio_number])
ypoints_list.append(portfolio_trails_simulation_sharpes_ratio_minun_σp_E_rp_df['E_rp'].values[portfolio_number])
# maximum return portfolio
top_sharpe_ratio_value_points_list.append(portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['sharpes_ratio'].values[portfolio_number])
xpoints_list.append(portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['σp'].values[portfolio_number])
ypoints_list.append(portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df['E_rp'].values[portfolio_number])
xpoints = np.array(xpoints_list)
ypoints = np.array(ypoints_list)
top_sharpe_ratio_value_points = np.array(top_sharpe_ratio_value_points_list)
return xpoints.clip(minimun_σp,max_E_rp_σp),ypoints.clip(minimun_σp_E_rp,max_E_rp), \
top_sharpe_ratio_value_points.clip(minimun_σp_E_rp_sharpe_ratio,max_E_rp_sharpe_ratio)
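The return statement above clips each array between a value taken from the minimum-risk portfolio and a value taken from the maximum-return portfolio. The sketch below, on made-up numbers, shows how `np.clip` behaves in that role. The bounds are only meaningful when the lower one does not exceed the upper one, which is not guaranteed for the Sharpe ratio bounds, since a minimum-risk portfolio can have a higher Sharpe ratio than the maximum-return one. With inverted bounds, `np.clip` returns the upper bound everywhere.

```python
import numpy as np

sharpe_values = np.array([0.40, 0.55, 0.70, 0.90])

# Usual case: lower bound below upper bound, values outside the band move to its edges.
clipped_ok = sharpe_values.clip(0.50, 0.80)

# Inverted bounds (lower > upper): every entry collapses to the upper bound.
clipped_inverted = sharpe_values.clip(0.80, 0.50)

print(clipped_ok)
print(clipped_inverted)
```

A guard such as sorting the two bounds before clipping would avoid the collapsed case; whether that matches the intended plot is a judgment call for the author.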
def get_maximun_minimum_points(df):
    # maximum return portfolio (each value is returned as a one-row Series)
    max_df = df.sort_values(by='E_rp', ascending=False)
    max_df = max_df.reset_index(drop=True)
    max_df = max_df.head(1)
    max_E_rp_σp = max_df['σp']
    max_E_rp = max_df['E_rp']
    max_E_rp_sharpe_ratio = max_df['sharpes_ratio']
    # minimum return portfolio
    min_df = df.sort_values(by='E_rp', ascending=True)
    min_df = min_df.reset_index(drop=True)
    min_df = min_df.head(1)
    minimun_σp = min_df['σp']
    minimun_σp_E_rp = min_df['E_rp']
    minimun_σp_E_rp_sharpe_ratio = min_df['sharpes_ratio']
    return max_E_rp_σp, max_E_rp, max_E_rp_sharpe_ratio, minimun_σp, minimun_σp_E_rp, minimun_σp_E_rp_sharpe_ratio
#-----------------------------------------Efficient Frontier Model Plotting-----------------------------------------------------------------------------------
#call efficient_frontiere_optimal_portfolios_df to add the sharpe ratio column to the trial portfolios dataframe
#select the optimal portfolios (portfolios with expected return higher than or equal to that of the minimum risk portfolio),
#sorted by sharpe ratio: efficient_frontiere_selected_sharpe_ratio_portfolio_df
#prepare the data for plotting and create the scatter plot
#------------------------------------------------------------------------------------------------------------------------------
def plot_fitted_curve(uncorrelated_weighted_portfolio_trails_simulation_df,fig, ax, label, marker, color ):
    # points used for the fit: top sharpe ratio, minimum risk and maximum return portfolios
    xpoints,ypoints,top_sharpe_ratio_value_points = \
        efficient_frontiere_optimal_portfolios_model_points(uncorrelated_weighted_portfolio_trails_simulation_df,7)
    row, col = uncorrelated_weighted_portfolio_trails_simulation_df.shape
    #--model definition: degree 2 polynomial, coefficients returned highest power first---
    popt = np.polyfit(xpoints, ypoints,2)
    mymodel = np.poly1d(popt)
    a, b, c = popt
    poly_d2_form = str('y =%.5f * x^2 + %.5f * x + %.5f' % (a, b, c))
    display(popt)
    myline = np.linspace(xpoints.min(), xpoints.max(), row)
    # optimal portfolios plotting
    ax.plot(xpoints,ypoints,'*',color='red',label='Optimal portfolios')
    ax.plot(myline, mymodel(myline),'.',color="blue",label=label + ':\n'+poly_d2_form)
    # R^2 of the fitted curve, evaluated on the points it was fitted to
    print(r2_score(ypoints, mymodel(xpoints)))
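`plot_fitted_curve` displays the `np.polyfit` coefficients and prints an R² score. The sketch below, on synthetic points rather than the notebook's data, illustrates two details that are easy to misread: `np.polyfit` returns the coefficients highest power first, and R² is `1 - SS_res/SS_tot`. The helper `r_squared` is written out by hand here as a stand-in for `r2_score`.

```python
import numpy as np

# Made-up (σp, E_rp) points lying exactly on y = -2x^2 + 3x + 1.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = -2.0 * x**2 + 3.0 * x + 1.0

coeffs = np.polyfit(x, y, 2)      # highest power first: [a, b, c] for a*x^2 + b*x + c
model = np.poly1d(coeffs)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

a, b, c = coeffs
r2 = r_squared(y, model(x))
```

Because the score is computed on the same points used for the fit, it describes goodness of fit rather than out-of-sample accuracy.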
def plot_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax, colorbar = 'yes'):
    # random portfolio plotting: risk on the x-axis, expected return on the y-axis, coloured by sharpe ratio
    optimal_portfolios_df = uncorrelated_weighted_portfolio_trails_simulation_df
    sharpes_ratio_optimal_portfolios_σp_col = optimal_portfolios_df['σp']
    sharpes_ratio_optimal_portfolios_E_rp_col = optimal_portfolios_df['E_rp']
    optimal_portfolios_sharpes_ratio_col = optimal_portfolios_df['sharpes_ratio']
    scplt = ax.scatter(sharpes_ratio_optimal_portfolios_σp_col, sharpes_ratio_optimal_portfolios_E_rp_col, marker="o",
                       c=optimal_portfolios_sharpes_ratio_col, cmap="viridis",label='Random Portfolios')
    if colorbar == 'yes':
        cb = fig.colorbar(scplt, ax=ax, label='Sharpe Ratio')
    ax.set_title("Towards an Efficient Frontier Model - Random portfolios Efficient Frontier")
def plot_fitted_curve_and_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df):
    fig, ax =plt.subplots(figsize=(12, 5))
    plot_fitted_curve(uncorrelated_weighted_portfolio_trails_simulation_df,fig, ax, label='Model to Approximate', marker= '*', color='red')
    plot_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax)
    ax.legend(prop = { "size": 8 })
#plot_fitted_curve_and_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df)
#----------------------------------------------------------------------------------
# minimum risk portfolio: here the portfolios are sorted from minimum risk
#-----------------------------------------------------------------------------------
def portfolio_strategy_minimum_risk(uncorrelated_weighted_portfolio_trails_simulation_df,number_of_top_points, most_diversify_portfolio_assets_list):
portfolio_trails_simulation_minimum_risk_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(
by='σp', ascending=True)
portfolio_trails_simulation_minimum_risk_σp_E_rp_df = portfolio_trails_simulation_minimum_risk_σp_E_rp_df.reset_index(drop=True)
portfolio_trails_simulation_minimum_risk_σp_E_rp_df = portfolio_trails_simulation_minimum_risk_σp_E_rp_df.head(number_of_top_points)
portfolio_weight_df = portfolio_trails_simulation_minimum_risk_σp_E_rp_df[most_diversify_portfolio_assets_list]
portfolio_weight_df = portfolio_weight_df*100
portfolio_weight_df1 = portfolio_weight_df.head(1)
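# NOTE: the two names below are swapped: portfolio_weight holds the tickers (column names)
# and asset_tickers holds the weights in percent; the callers rely on this order
portfolio_weight =portfolio_weight_df1.columns.values.tolist()
asset_tickers = portfolio_weight_df1.iloc[0].tolist()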
portfolio_investment_strategy_df = pd.DataFrame({'Portfolio Weight':portfolio_weight,'Asset Tickers':asset_tickers})
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Asset Tickers',ascending=True)
portfolio_investment_strategy_Trans_df = portfolio_investment_strategy_df.transpose()
strategy_Weight = portfolio_investment_strategy_df['Portfolio Weight']
strategy_tickers = portfolio_investment_strategy_df['Asset Tickers']
return strategy_Weight, strategy_tickers, portfolio_trails_simulation_minimum_risk_σp_E_rp_df
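The function above turns the first row of the weight columns into a two-column table and sorts it by weight. In that code the variable `portfolio_weight` actually holds the tickers and `asset_tickers` the weights. The sketch below does the same transformation with the names the right way round. The tickers, the weights and the helper name `weights_row_to_table` are hypothetical.

```python
import pandas as pd

def weights_row_to_table(simulation_df, asset_columns):
    """Return a (ticker, weight %) table for the first row of `simulation_df`, smallest weight first."""
    first_row = simulation_df[asset_columns].iloc[0] * 100   # weights as percentages
    table = pd.DataFrame({'Asset Tickers': first_row.index,
                          'Portfolio Weight': first_row.values})
    return table.sort_values(by='Portfolio Weight', ascending=True).reset_index(drop=True)

# Hypothetical simulation table: three weight columns plus risk and return.
toy_df = pd.DataFrame({'AAA': [0.5, 0.2], 'BBB': [0.125, 0.3], 'CCC': [0.375, 0.5],
                       'σp': [0.1, 0.2], 'E_rp': [0.05, 0.07]})
table = weights_row_to_table(toy_df, ['AAA', 'BBB', 'CCC'])
```

A horizontal bar chart of this table would pass `table['Asset Tickers']` as the y positions and `table['Portfolio Weight']` as the bar widths.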
#----------------------------------------------------------------------------
#maximum risk portfolio: here the portfolios are sorted from maximum risk
#----------------------------------------------------------------------------
def portfolio_strategy_maximun_risk(uncorrelated_weighted_portfolio_trails_simulation_df,number_of_top_points):
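# NOTE: most_diversify_portfolio_assets_list is not a parameter here; it is read from the enclosing (notebook-level) scope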
portfolio_trails_simulation_max_risk_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(
by='σp', ascending=False)
portfolio_trails_simulation_max_risk_σp_E_rp_df = portfolio_trails_simulation_max_risk_σp_E_rp_df.reset_index(drop=True)
portfolio_trails_simulation_max_risk_σp_E_rp_df = portfolio_trails_simulation_max_risk_σp_E_rp_df.head(number_of_top_points)
portfolio_weight_df = portfolio_trails_simulation_max_risk_σp_E_rp_df[most_diversify_portfolio_assets_list]
portfolio_weight_df = portfolio_weight_df*100
portfolio_weight_df1 = portfolio_weight_df.head(1)
portfolio_weight =portfolio_weight_df1.columns.values.tolist()
asset_tickers = portfolio_weight_df1.iloc[0].tolist()
portfolio_investment_strategy_df = pd.DataFrame({'Portfolio Weight':portfolio_weight,'Asset Tickers':asset_tickers})
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Asset Tickers',ascending=True)
portfolio_investment_strategy_Trans_df = portfolio_investment_strategy_df.transpose()
#display(portfolio_investment_strategy_Trans_df)
strategy_Weight = portfolio_investment_strategy_df['Portfolio Weight']
strategy_tickers = portfolio_investment_strategy_df['Asset Tickers']
return strategy_Weight, strategy_tickers, portfolio_trails_simulation_max_risk_σp_E_rp_df
#---------------------------------
# maximum return portfolio
#--------------------------------
def portfolio_strategy_maximun_return(uncorrelated_weighted_portfolio_trails_simulation_df,number_of_top_points):
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(
by='E_rp', ascending=False)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df.reset_index(drop=True)
portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df.head(number_of_top_points)
portfolio_weight_df = portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df[most_diversify_portfolio_assets_list]
portfolio_weight_df = portfolio_weight_df*100
portfolio_weight_df1 = portfolio_weight_df.head(1)
portfolio_weight =portfolio_weight_df1.columns.values.tolist()
asset_tickers = portfolio_weight_df1.iloc[0].tolist()
portfolio_investment_strategy_df = pd.DataFrame({'Portfolio Weight':portfolio_weight,'Asset Tickers':asset_tickers})
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Asset Tickers',ascending=True)
portfolio_investment_strategy_Trans_df = portfolio_investment_strategy_df.transpose()
#display(portfolio_investment_strategy_Trans_df)
strategy_Weight = portfolio_investment_strategy_df['Portfolio Weight']
strategy_tickers = portfolio_investment_strategy_df['Asset Tickers']
return strategy_Weight, strategy_tickers, portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df
#-------------------------------------------------------------------------
#sort from maximum sharpe ratio and get top sharpe ratio portfolios
#-------------------------------------------------------------------------
def portfolio_strategy_top_sharpe_ratio(uncorrelated_weighted_portfolio_trails_simulation_df,number_of_top_points):
portfolio_trails_simulation_sharpes_ratio_top_df = uncorrelated_weighted_portfolio_trails_simulation_df.sort_values(
by='sharpes_ratio', ascending=False)
portfolio_trails_simulation_sharpes_ratio_top_df = portfolio_trails_simulation_sharpes_ratio_top_df.reset_index(drop=True)
uncorrelated_portfolio_trails_simulation_sharpes_ratio_top_df = portfolio_trails_simulation_sharpes_ratio_top_df.head(number_of_top_points)
portfolio_weight_df = uncorrelated_portfolio_trails_simulation_sharpes_ratio_top_df[most_diversify_portfolio_assets_list]
portfolio_weight_df = portfolio_weight_df*100
portfolio_weight_df1 = portfolio_weight_df.head(1)
portfolio_weight =portfolio_weight_df1.columns.values.tolist()
asset_tickers = portfolio_weight_df1.iloc[0].tolist()
portfolio_investment_strategy_df = pd.DataFrame({'Portfolio Weight':portfolio_weight,'Asset Tickers':asset_tickers})
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Asset Tickers',ascending=True)
portfolio_investment_strategy_Trans_df = portfolio_investment_strategy_df.transpose()
#display(portfolio_investment_strategy_Trans_df)
strategy_Weight = portfolio_investment_strategy_df['Portfolio Weight']
strategy_tickers = portfolio_investment_strategy_df['Asset Tickers']
return strategy_Weight, strategy_tickers, uncorrelated_portfolio_trails_simulation_sharpes_ratio_top_df
#------------------------------------------------------------------------
#plot the asset allocation of the four strategy portfolios on a 2x2 grid
#------------------------------------------------------------------------
def portfolio_strategy_plotting(uncorrelated_weighted_portfolio_trails_simulation_df, most_diversify_portfolio_assets_list, number_of_top_points):
    fig, ax =plt.subplots(2,2, figsize=(14, 10))
    # maximum sharpe ratio portfolio
    # in each barh call below, the first argument carries the tickers (y positions) and the second the weights in percent (bar widths)
    strategy_Weight, strategy_tickers,uncorrelated_portfolio_trails_simulation_sharpes_ratio_top_df = \
        portfolio_strategy_top_sharpe_ratio(uncorrelated_weighted_portfolio_trails_simulation_df, number_of_top_points)
    bar_container= ax[0,0].barh(strategy_Weight,strategy_tickers)
    ax[0,0].set_ylabel("Asset Tickers")
    #ax[0,0].set_xlabel("Portfolio Weight")
    ax[0,0].set_title("Maximum Sharpe Ratio Portfolio Assets Allocation")
    ax[0,0].bar_label(bar_container, fmt='{:,.0f}%')
    # maximum return portfolio
    strategy_Weight, strategy_tickers,portfolio_trails_simulation_sharpes_ratio_max_σp_E_rp_df = \
        portfolio_strategy_maximun_return(uncorrelated_weighted_portfolio_trails_simulation_df, number_of_top_points)
    bar_container= ax[0,1].barh(strategy_Weight,strategy_tickers)
    ax[0,1].set_ylabel("Asset Tickers")
    #ax[0,1].set_xlabel("Portfolio Weight")
    ax[0,1].set_title("Maximum Return Portfolio Assets Allocation")
    ax[0,1].bar_label(bar_container, fmt='{:,.0f}%')
    # maximum risk portfolio: here the portfolios are sorted from maximum risk
    strategy_Weight, strategy_tickers,portfolio_trails_simulation_max_risk_σp_E_rp_df = \
        portfolio_strategy_maximun_risk(uncorrelated_weighted_portfolio_trails_simulation_df, number_of_top_points)
    bar_container= ax[1,0].barh(strategy_Weight,strategy_tickers)
    ax[1,0].set_ylabel("Asset Tickers")
    #ax[1,0].set_xlabel("Portfolio Weight")
    ax[1,0].set_title("Maximum Risk Portfolio Assets Allocation")
    ax[1,0].bar_label(bar_container, fmt='{:,.0f}%')
    # minimum risk portfolio: here the portfolios are sorted from minimum risk
    strategy_Weight, strategy_tickers,portfolio_trails_simulation_minimum_risk_σp_E_rp_df = \
        portfolio_strategy_minimum_risk(uncorrelated_weighted_portfolio_trails_simulation_df, number_of_top_points, most_diversify_portfolio_assets_list)
    bar_container= ax[1,1].barh(strategy_Weight,strategy_tickers)
    ax[1,1].set_ylabel("Asset Tickers")
    #ax[1,1].set_xlabel("Portfolio Weight")
    ax[1,1].set_title("Minimum Risk Portfolio Assets Allocation")
    ax[1,1].bar_label(bar_container, fmt='{:,.0f}%')
    plt.show()
plot_fitted_curve_and_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df)
portfolio_strategy_plotting(uncorrelated_weighted_portfolio_trails_simulation_df, most_diversify_portfolio_assets_list, 10)
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
# Build the model
def model_poly_d2(x, a, b, c):
    # note the parameter order: b multiplies x**2 and a multiplies x
    return b * x**2 + a * x + c
# Data Splitting / Model Selection
def polynomial_degree2_model(uncorrelated_weighted_portfolio_trails_simulation_df):
    # Load the data: original random portfolios data points
    xpoints, ypoints, original_random_sharpe_ratio = \
        efficient_frontiere_optimal_portfolios_model_points( uncorrelated_weighted_portfolio_trails_simulation_df)
    # Split into training and testing data
    x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2 = \
        train_test_split(xpoints, ypoints, test_size=0.3, random_state=42)
    # Split an evenly spaced σp grid over the same range into validation and testing inputs
    x_model_validation_poly_d2, x_model_testing_poly_d2 = train_test_split(np.linspace(min(xpoints), max(xpoints),
                                                                           len(xpoints)), test_size=0.3, random_state=42)
    # model training to get the parameters
    popt_poly_d2, pcov_poly_d2 = curve_fit(model_poly_d2, x_train_poly_d2,y_train_poly_d2, maxfev=50000)
    a, b, c = popt_poly_d2
    # b is the x^2 coefficient and a the x coefficient, so the label lists them in (b, a, c) order
    poly_d2_form = str('y =%.5f * x^2 + %.5f * x + %.5f' % (b, a, c))
    return x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2,popt_poly_d2, \
           pcov_poly_d2, x_model_validation_poly_d2, x_model_testing_poly_d2, model_poly_d2, poly_d2_form
#x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2,popt_poly_d2, \
# pcov_poly_d2, x_model_validation_poly_d2, x_model_testing_poly_d2, model_poly_d2, poly_d2_form = \
# polynomial_degree2_model(uncorrelated_weighted_portfolio_trails_simulation_df)
# Build the model
def model_poly_d3_log(x, a, b, c, d, e):
    return a * np.log(abs(b )* x) + c*x**3 +d*x**2 + e
def polynomial_degree3_log_model(uncorrelated_weighted_portfolio_trails_simulation_df):
    # Load the data: original random portfolios data points
    xpoints,ypoints,original_random_sharpe_ratio = efficient_frontiere_optimal_portfolios_model_points(uncorrelated_weighted_portfolio_trails_simulation_df)
    # Split into training and testing data
    x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log = \
        train_test_split(xpoints, ypoints, test_size=0.3, random_state=42)
    x_model_validation_poly_d3_log, x_model_testing_poly_d3_log = \
        train_test_split(np.linspace(min(xpoints), max(xpoints), len(xpoints)), test_size=0.3, random_state=42)
    # model validation data
    #x_model_validation = np.linspace(min(x_train), max(x_train), number_of_top_points*3)
    # model training to get the parameters
    popt_poly_d3_log, pcov_poly_d3_log = curve_fit(model_poly_d3_log, x_train_poly_d3_log,y_train_poly_d3_log, maxfev=50000)
    a, b, c, d, e = popt_poly_d3_log
    # the label follows the model: d multiplies x**2
    poly_d3_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**3 + %.5f * x**2 + %.5f' % (a, b, c, d, e))
    return x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log,popt_poly_d3_log, pcov_poly_d3_log, \
           x_model_validation_poly_d3_log, x_model_testing_poly_d3_log, model_poly_d3_log, poly_d3_log_form
#x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log,popt_poly_d3_log, pcov_poly_d3_log, \
# x_model_validation_poly_d3_log, x_model_testing_poly_d3_log, model_poly_d3_log, poly_d3_log_form = \
# polynomial_degree3_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
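In `model_poly_d3_log`, the term `a*log(|b|*x)` expands to `a*log(x) + a*log(|b|)`, so `b` only shifts the constant and the fit cannot separate it from `e`. One way to sidestep this, sketched below as an alternative rather than as the notebook's method, is to fit the equivalent linear form `a*log(x) + c*x**3 + d*x**2 + k` by ordinary least squares, where `k` stands for `a*log(|b|) + e`. The data are synthetic and the function name `fit_log_cubic` is an assumption.

```python
import numpy as np

def fit_log_cubic(x, y):
    """Least-squares fit of y = a*log(x) + c*x**3 + d*x**2 + k; requires x > 0."""
    design = np.column_stack([np.log(x), x**3, x**2, np.ones_like(x)])
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params  # a, c, d, k

# Synthetic, noise-free points generated from known coefficients.
x = np.linspace(0.5, 3.0, 20)
y = 0.25 * np.log(x) + 0.01 * x**3 - 0.06 * x**2 + 0.3

a, c, d, k = fit_log_cubic(x, y)
```

Because the reparameterized model is linear in its coefficients, no starting values or iteration cap such as `maxfev` are needed.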
# Build the model
def model_poly_d5_log(x, a, b, c):
    return a*np.log(abs(b)*x) + c*x**5
def polynomial_degree5_log_model(uncorrelated_weighted_portfolio_trails_simulation_df):
    # Load the data: original random portfolios data points
    xpoints,ypoints,original_random_sharpe_ratio = efficient_frontiere_optimal_portfolios_model_points(uncorrelated_weighted_portfolio_trails_simulation_df)
    # Split into training and testing data
    x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log = \
        train_test_split(xpoints, ypoints, test_size=0.3, random_state=42)
    x_model_validation_poly_d5_log, x_model_testing_poly_d5_log = \
        train_test_split(np.linspace(min(xpoints), max(xpoints), len(xpoints)), test_size=0.3, random_state=42)
    # model training to get the parameters
    popt_poly_d5_log, pcov_poly_d5_log = curve_fit(model_poly_d5_log, x_train_poly_d5_log,y_train_poly_d5_log, maxfev=50000)
    a, b, c = popt_poly_d5_log
    poly_d5_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**5' % (a, b, c))
    return x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log, popt_poly_d5_log, pcov_poly_d5_log, \
           x_model_validation_poly_d5_log, x_model_testing_poly_d5_log, model_poly_d5_log, poly_d5_log_form
#x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log, popt_poly_d5_log, pcov_poly_d5_log, \
# x_model_validation_poly_d5_log, x_model_testing_poly_d5_log, model_poly_d5_log, poly_d5_log_form = \
# polynomial_degree5_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
def models_plotting(x, y, uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax, model_form):
    #------Random portfolio data plotting
    plot_random_portfolios(uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax,'no')
    # model predictions, coloured by their own ratio y/x
    cspl = ax.scatter(x=x, y=y, c=y/x, cmap="viridis",label='Efficient Frontier:\n'+model_form)
    #-----------model to approximate
    plot_fitted_curve(uncorrelated_weighted_portfolio_trails_simulation_df,fig, ax, label='Fitted Curve', marker= '*', color='red')
    ax.legend(bbox_to_anchor=(0.72, 1.38), ncol=1, prop = { "size": 8})
    return cspl
def dataframe_clipping(x_σp, y_E_rp, y_E_rp_pred ):
    # error = predicted - observed expected return; error >= 0 means the portfolio lies on or below the fitted curve
    clipped_df = pd.DataFrame({'σp':x_σp,'E_rp':y_E_rp,'y_E_rp_pred':y_E_rp_pred,'error':y_E_rp_pred - y_E_rp})
    clipped_df = clipped_df.sort_values(by='error',ascending=False)
    # cap the expected return at the model prediction
    clipped_df['y_optimal_E_rp'] = np.where(clipped_df['E_rp'] <= clipped_df['y_E_rp_pred'], clipped_df['E_rp'],clipped_df['y_E_rp_pred'] )
    clipped_df['sharpes_ratio'] = clipped_df['y_optimal_E_rp']/clipped_df['σp']
    # drop the portfolios lying above the fitted curve
    return clipped_df[clipped_df['error'] >= 0]
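`dataframe_clipping` keeps the portfolios whose observed return does not exceed the model prediction. A minimal sketch of that filter on made-up numbers, using plain NumPy arrays instead of the DataFrame:

```python
import numpy as np

# Hypothetical risk, observed return and model-predicted return for three portfolios.
sigma = np.array([0.10, 0.20, 0.30])
observed = np.array([0.05, 0.12, 0.09])
predicted = np.array([0.06, 0.10, 0.09])

error = predicted - observed
keep = error >= 0                                      # on or below the fitted curve
kept_sigma = sigma[keep]
# Elementwise minimum is equivalent to np.where(observed <= predicted, observed, predicted).
kept_return = np.minimum(observed, predicted)[keep]
kept_ratio = kept_return / kept_sigma
```

For every kept row the capped return equals the observed return, so the cap only matters for rows that the final filter removes anyway.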
def model_uperBound_efficient_frontier( uncorrelated_weighted_portfolio_trails_simulation_df, model, model_popt,
                                        ax , mode_form, random_points = 0):
    x_σp = uncorrelated_weighted_portfolio_trails_simulation_df['σp']
    y_E_rp = uncorrelated_weighted_portfolio_trails_simulation_df['E_rp']
    row, col = uncorrelated_weighted_portfolio_trails_simulation_df.shape
    # here the original dataframe is clipped to eliminate the outliers above the fitted upper bound
    y_E_rp_pred = model(x_σp, *model_popt)
    clipped_df = dataframe_clipping(x_σp, y_E_rp, y_E_rp_pred )
    xpoints,ypoints,top_sharpe_ratio_value_points = efficient_frontiere_optimal_portfolios_model_points(clipped_df,7)
    #------Random portfolio data plotting
    if random_points == 0:
        scplt = ax.scatter(clipped_df['σp'], clipped_df['E_rp'], marker="o", c=clipped_df['E_rp']/clipped_df['σp'],
                           cmap="viridis",label='Random Portfolios')
    else:
        xrandom_points,yrandom_points,random_sharpe_ratio_value_points = \
            efficient_frontiere_optimal_sharpe_ratio_portfolios_model_points(clipped_df,random_points)
        scplt = ax.scatter(x=xrandom_points, y=yrandom_points, marker="o", c= random_sharpe_ratio_value_points,
                           cmap="viridis",label='Random Portfolios')
    # efficient frontier plotting
    x_model_σp = np.linspace(xpoints.min(), xpoints.max(), row)
    y_model_E_rp_pred = model(x_model_σp, *model_popt)
    # colour each frontier point by its own ratio (predicted return on the grid divided by the grid risk)
    cspl = ax.scatter(x=x_model_σp, y=y_model_E_rp_pred, marker="*", c= y_model_E_rp_pred/x_model_σp,
                      cmap="viridis",label='Efficient Frontier:\n'+mode_form)
    ax.set_title("Boundary Random portfolios Efficient Frontier")
    ax.legend(bbox_to_anchor=(0.72, 1.38), ncol=1, prop = { "size": 8})
    return scplt
def evalute_model_parameters(uncorrelated_weighted_portfolio_trails_simulation_df):
#polynomial degree 2 model: b * x**2 + a * x + c
x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2,popt_poly_d2, \
pcov_poly_d2, x_model_validation_poly_d2, x_model_testing_poly_d2, model_poly_d2, poly_d2_form = \
polynomial_degree2_model(uncorrelated_weighted_portfolio_trails_simulation_df)
# model parameters
a, b, c = popt_poly_d2
#model prediction
y_model_validation_pred_poly_d2 = model_poly_d2(x_model_validation_poly_d2, a, b, c)
#polynomial degree 3 log model: a * np.log(b * x) + c*x**3 +d*x**2 + e
x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log,popt_poly_d3_log, pcov_poly_d3_log, \
x_model_validation_poly_d3_log, x_model_testing_poly_d3_log, model_poly_d3_log, poly_d3_log_form = \
polynomial_degree3_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
# model parameters
a, b, c, d, e = popt_poly_d3_log
y_model_validation_pred_poly_d3_log = model_poly_d3_log(x_model_validation_poly_d3_log, a, abs(b), c, d, e)
#polynomial degree 5 log model: a*np.log(b*x) + c*x**5
x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log, popt_poly_d5_log, pcov_poly_d5_log, \
x_model_validation_poly_d5_log, x_model_testing_poly_d5_log, model_poly_d5_log, poly_d5_log_form = \
polynomial_degree5_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
# model parameters
a, b, c = popt_poly_d5_log
y_model_validation_pred_poly_d5_log = model_poly_d5_log(x_model_validation_poly_d5_log, a,abs(b), c)
return popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2,x_model_validation_poly_d3_log, \
x_model_validation_poly_d5_log, y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, \
y_model_validation_pred_poly_d3_log, y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, model_poly_d2, \
model_poly_d3_log, model_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form
#popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2,x_model_validation_poly_d3_log, \
#x_model_validation_poly_d5_log, y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, \
#y_model_validation_pred_poly_d3_log, y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, model_poly_d2, \
#model_poly_d3_log, model_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form = \
# evalute_model_parameters(uncorrelated_weighted_portfolio_trails_simulation_df)
def model_validation_plotting(uncorrelated_weighted_portfolio_trails_simulation_df):
fig, ax =plt.subplots(2,2,figsize=(13, 13), constrained_layout=True)
popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2,x_model_validation_poly_d3_log, \
x_model_validation_poly_d5_log, y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, \
y_model_validation_pred_poly_d3_log, y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, model_poly_d2, \
model_poly_d3_log, model_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form = \
evalute_model_parameters(uncorrelated_weighted_portfolio_trails_simulation_df)
cspl1 = models_plotting(x_model_validation_poly_d2, y_model_validation_pred_poly_d2,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,0], poly_d2_form)
cspl2 = models_plotting(x_model_validation_poly_d3_log, y_model_validation_pred_poly_d3_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,1],poly_d3_log_form)
cspl = models_plotting(x_model_validation_poly_d5_log, y_model_validation_pred_poly_d5_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[1,0], poly_d5_log_form)
cplt4 = model_uperBound_efficient_frontier(uncorrelated_weighted_portfolio_trails_simulation_df, model_poly_d2,popt_poly_d2,
ax[1,1], poly_d2_form)
cb = fig.colorbar(cspl, ax=ax, label='Sharpe Ratio',orientation='horizontal',shrink=0.6)
model_validation_plotting(uncorrelated_weighted_portfolio_trails_simulation_df)
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
def fine_tune_hyperparmeters(uncorrelated_weighted_portfolio_trails_simulation_df):
#polynomial degree 2 model: b * x**2 + a * x + c
x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2,popt_poly_d2, \
pcov_poly_d2, x_model_validation_poly_d2, x_model_testing_poly_d2, model_poly_d2, poly_d2_form = \
polynomial_degree2_model(uncorrelated_weighted_portfolio_trails_simulation_df)
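y_model_turning_pred_poly_d2 = model_poly_d2(x_model_validation_poly_d2, 0.075, -0.019, -0.007)
# model_poly_d2 is b*x**2 + a*x + c, so the label lists the same hand-tuned values in (b, a, c) order
poly_d2_form = str('y =%.5f * x^2 + %.5f * x + %.5f' % (-0.019, 0.075, -0.007))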
#polynomial degree 3 log model: a * np.log(b * x) + c*x**3 +d*x**2 + e
x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log,popt_poly_d3_log, pcov_poly_d3_log, \
x_model_validation_poly_d3_log, x_model_testing_poly_d3_log, model_poly_d3_log, poly_d3_log_form = \
polynomial_degree3_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
y_model_tuning_pred_poly_d3_log = model_poly_d3_log(x_model_validation_poly_d3_log, 0.256, 0.348, 0.00793, -0.060, 0.343)
poly_d3_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**3 + %.5f * x**2 + %.5f' % (0.256, 0.348, 0.00793, -0.060, 0.343))
#polynomial degree 5 log model: a*np.log(b*x) + c*x**5
x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log, popt_poly_d5_log, pcov_poly_d5_log, \
x_model_validation_poly_d5_log, x_model_testing_poly_d5_log, model_poly_d5_log, poly_d5_log_form = \
polynomial_degree5_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
y_model_turning_pred_poly_d5_log = model_poly_d5_log(x_model_validation_poly_d5_log, 0.085, 1.44, -0.00058)
poly_d5_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**5' % (0.085, 1.44, -0.00058))
return model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2, \
x_model_validation_poly_d3_log, x_model_validation_poly_d5_log, y_train_poly_d2, y_model_turning_pred_poly_d2, y_train_poly_d3_log, \
y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log, y_model_turning_pred_poly_d5_log, poly_d2_form, \
poly_d3_log_form, poly_d5_log_form
#model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2, x_model_validation_poly_d3_log, \
#x_model_validation_poly_d5_log, y_train_poly_d2, y_model_tuning_pred_poly_d2, y_train_poly_d3_log, \
#y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log, y_model_tuning_pred_poly_d5_log, poly_d2_form, \
#poly_d3_log_form, poly_d5_log_form= fine_tune_hyperparmeters(uncorrelated_weighted_portfolio_trails_simulation_df)
def model_tuning_plotting(uncorrelated_weighted_portfolio_trails_simulation_df):
fig, ax =plt.subplots(2,2,figsize=(13, 13), constrained_layout=True)
print(" Models Fine-tuning ")
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2, x_model_validation_poly_d3_log, \
x_model_validation_poly_d5_log, y_train_poly_d2, y_model_tuning_pred_poly_d2, y_train_poly_d3_log, \
y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log, y_model_tuning_pred_poly_d5_log, poly_d2_form, \
poly_d3_log_form, poly_d5_log_form= fine_tune_hyperparmeters(uncorrelated_weighted_portfolio_trails_simulation_df)
cspl1 = models_plotting(x_model_validation_poly_d2, y_model_tuning_pred_poly_d2,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,0], poly_d2_form)
cspl2 = models_plotting(x_model_validation_poly_d3_log, y_model_tuning_pred_poly_d3_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,1], poly_d3_log_form)
cspl = models_plotting(x_model_validation_poly_d5_log, y_model_tuning_pred_poly_d5_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[1,0], poly_d5_log_form)
cplt4 = model_uperBound_efficient_frontier(uncorrelated_weighted_portfolio_trails_simulation_df,
model_poly_d2,popt_poly_d2, ax[1,1], poly_d2_form,7000)
cb = fig.colorbar(cspl, ax=ax, label='Sharpe Ratio',orientation='horizontal',shrink=0.6)
model_tuning_plotting(uncorrelated_weighted_portfolio_trails_simulation_df)
Models Fine-tuning
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
def test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df):
#polynoial degree 2 model b * x**2 + a * x + c
x_train_poly_d2, x_test_poly_d2, y_train_poly_d2, y_test_poly_d2,popt_poly_d2, \
pcov_poly_d2, x_model_validation_poly_d2, x_model_testing_poly_d2, model_poly_d2, poly_d2_form = \
polynomial_degree2_model(uncorrelated_weighted_portfolio_trails_simulation_df)
y_model_test_pred_poly_d2 = model_poly_d2(x_test_poly_d2, 0.07, -0.016, -0.009)
poly_d2_form = str('y =%.5f * x^2 + %.5f * x + %.5f' % (0.07, -0.016, -0.009))
#polynomial degree 3 log model: a * np.log(b * x) + c*x**3 +d*x**2 + e
x_train_poly_d3_log, x_test_poly_d3_log, y_train_poly_d3_log, y_test_poly_d3_log,popt_poly_d3_log, pcov_poly_d3_log, \
x_model_validation_poly_d3_log, x_model_testing_poly_d3_log, model_poly_d3_log, poly_d3_log_form = \
polynomial_degree3_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
y_model_test_pred_poly_d3_log = model_poly_d3_log(x_test_poly_d3_log, 0.256, 0.348, 0.00793, -0.060, 0.343)
poly_d3_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**3 + %.5f * x + %.5f' % (0.256, 0.348, 0.00793, -0.060, 0.343))
#polynomial degree 5 log model: a*np.log(b*x) + c*x**5
x_train_poly_d5_log, x_test_poly_d5_log, y_train_poly_d5_log, y_test_poly_d5_log, popt_poly_d5_log, pcov_poly_d5_log, \
x_model_validation_poly_d5_log, x_model_testing_poly_d5_log, model_poly_d5_log, poly_d5_log_form = \
polynomial_degree5_log_model(uncorrelated_weighted_portfolio_trails_simulation_df)
y_model_test_pred_poly_d5_log = model_poly_d5_log(x_test_poly_d5_log, 0.085, 1.44, -0.00058)
poly_d5_log_form = str('y =%.5f * np.log( %.5f*x) + %.5f * x**5' % (0.085, 1.44, -0.00058))
return model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, \
y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, \
y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form= \
test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df)
def model_testing_plotting(uncorrelated_weighted_portfolio_trails_simulation_df):
fig, ax =plt.subplots(2,2,figsize=(13, 13), constrained_layout=True)
print(" Model Testing ")
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, \
y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form= \
test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df)
cspl1 = models_plotting(x_test_poly_d2, y_model_test_pred_poly_d2,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,0], poly_d2_form)
cspl2 = models_plotting(x_test_poly_d3_log, y_model_test_pred_poly_d3_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[0,1], poly_d3_log_form)
cspl = models_plotting(x_test_poly_d5_log, y_model_test_pred_poly_d5_log,
uncorrelated_weighted_portfolio_trails_simulation_df, fig, ax[1,0], poly_d5_log_form)
cplt4 = model_uperBound_efficient_frontier(uncorrelated_weighted_portfolio_trails_simulation_df, \
model_poly_d2,popt_poly_d2, ax[1,1], poly_d2_form)
cb = fig.colorbar(cspl, ax=ax, label='Sharpe Ratio',orientation='horizontal',shrink=0.6)
model_testing_plotting(uncorrelated_weighted_portfolio_trails_simulation_df)
Model Testing
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
array([-0.06841505, 0.23859677, -0.14919578])
0.9642722705586393
def error_metrics_statistics(y_true_0, y_pred_0, y_true_1, y_pred_1, y_true_2, y_pred_2,
                             poly_d2_form, poly_d3_log_form, poly_d5_log_form):
    # Show the form of each model, then build a table with one row per error metric
    display('Poly_d2 : '+poly_d2_form)
    display('Poly_d3_log: '+poly_d3_log_form)
    display('Poly_d5_log: '+poly_d5_log_form)
    error_metrics_table = [
        ['Type Error', 'Poly_d2 Error', 'Poly_d3_log Error', 'Poly_d5_log Error'],
        ['Mean Absolute Error(MAE)', mean_absolute_error(y_true_0, y_pred_0), mean_absolute_error(y_true_1, y_pred_1), mean_absolute_error(y_true_2, y_pred_2)],
        ['Mean Absolute Percentage Error(MAPE)', mean_absolute_percentage_error(y_true_0, y_pred_0), mean_absolute_percentage_error(y_true_1, y_pred_1), mean_absolute_percentage_error(y_true_2, y_pred_2)],
        # Note: this row is the negated MSE; despite the "(RMSE)" label, no square root is taken.
        ['Neg.Mean Squared Error(RMSE)', -mean_squared_error(y_true_0, y_pred_0), -mean_squared_error(y_true_1, y_pred_1), -mean_squared_error(y_true_2, y_pred_2)],
        ['R-squared score', r2_score(y_true_0, y_pred_0), r2_score(y_true_1, y_pred_1), r2_score(y_true_2, y_pred_2)],
        ['Mean Squared Error(MSE)', mean_squared_error(y_true_0, y_pred_0), mean_squared_error(y_true_1, y_pred_1), mean_squared_error(y_true_2, y_pred_2)],
        ['Mean Squared Log Error(MSLE)', mean_squared_log_error(y_true_0, y_pred_0), mean_squared_log_error(y_true_1, y_pred_1), mean_squared_log_error(y_true_2, y_pred_2)]]
    return error_metrics_table
def model_residual_metrics(y_train, y_model_validation_pred, y_model_tuning_pred, y_test, y_model_test_pred):
    validation_residual = y_train - y_model_validation_pred
    residuals_tuning_train = y_train - y_model_tuning_pred
    residuals_test = y_test - y_model_test_pred
    return validation_residual, residuals_tuning_train, residuals_test

def model_residual_plotting(y_train, y_model_validation_pred, y_model_tuning_pred, y_test, y_model_test_pred, ax, title):
    validation_residual, residuals_tuning_train, residuals_test = \
        model_residual_metrics(y_train, y_model_validation_pred, y_model_tuning_pred, y_test, y_model_test_pred)
    sns.scatterplot(ax=ax, x=y_model_validation_pred, y=validation_residual, label='Validation')
    sns.scatterplot(ax=ax, x=y_model_tuning_pred, y=residuals_tuning_train, label='Tuning')
    sns.scatterplot(ax=ax, x=y_model_test_pred, y=residuals_test, label='Test')
    ax.hlines(0, min(y_model_validation_pred), max(y_model_validation_pred), colors='r', linestyles='dashed')
    ax.hlines(0, min(y_model_tuning_pred), max(y_model_tuning_pred), colors='r', linestyles='dashed')
    ax.hlines(0, min(y_model_test_pred), max(y_model_test_pred), colors='r', linestyles='dashed')
    ax.set_xlabel('Predicted Values')
    ax.set_ylabel('Residuals')
    ax.set_title(title)

def error_distribution(y_train, y_model_validation_pred, y_model_tuning_pred, y_test, y_model_test_pred, ax, title):
    # Residual calculation
    validation_residual, residuals_tuning_train, residuals_test = \
        model_residual_metrics(y_train, y_model_validation_pred, y_model_tuning_pred, y_test, y_model_test_pred)
    # Calculate errors
    error_validation = -1*validation_residual
    tuning_error_train = -1*residuals_tuning_train
    error_test = -1*residuals_test
    # Plot error distribution
    sns.histplot(ax=ax, x=error_validation, kde=True, label='Validation errors', color='blue')
    sns.histplot(ax=ax, x=tuning_error_train, kde=True, label='Tuning errors', color='orange')
    sns.histplot(ax=ax, x=error_test, kde=True, label='Test errors', color='green')
    ax.set_xlabel('Error')
    ax.set_ylabel('Frequency')
    ax.set_title(title)
    ax.legend()
def residual_and_error_plotting(uncorrelated_weighted_portfolio_trails_simulation_df):
fig, ax =plt.subplots(2,3,figsize=(23, 17))
#model validation
popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2,x_model_validation_poly_d3_log, \
x_model_validation_poly_d5_log, y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, \
y_model_validation_pred_poly_d3_log, y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, model_poly_d2, \
model_poly_d3_log, model_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form = \
evalute_model_parameters(uncorrelated_weighted_portfolio_trails_simulation_df)
# Model Fine-tuning
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2, \
x_model_validation_poly_d3_log, x_model_validation_poly_d5_log, y_train_poly_d2, y_model_tuning_pred_poly_d2, y_train_poly_d3_log, \
y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log, y_model_tuning_pred_poly_d5_log, poly_d2_form, \
poly_d3_log_form, poly_d5_log_form= fine_tune_hyperparmeters(uncorrelated_weighted_portfolio_trails_simulation_df)
# Model Testing
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, \
y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form= \
test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df)
#-------------------------------------residual plotting---------------------------------------------------------------
#model validation
# poly_d2_residual
model_residual_plotting(y_train_poly_d2, y_model_validation_pred_poly_d2, y_model_tuning_pred_poly_d2,
y_test_poly_d2, y_model_test_pred_poly_d2, ax[0,0], poly_d2_form)
#poly_d3_log
model_residual_plotting(y_train_poly_d3_log, y_model_validation_pred_poly_d3_log, y_model_tuning_pred_poly_d3_log,
y_test_poly_d3_log, y_model_test_pred_poly_d3_log, ax[0,1], poly_d3_log_form)
#poly_d5_log
model_residual_plotting(y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, y_model_tuning_pred_poly_d5_log,
y_test_poly_d5_log, y_model_test_pred_poly_d5_log, ax[0,2], poly_d5_log_form)
#-----------error plotting---------------------------------------------------------------------------------------------
# poly_d2_residual
error_distribution(y_train_poly_d2, y_model_validation_pred_poly_d2, y_model_tuning_pred_poly_d2,
y_test_poly_d2, y_model_test_pred_poly_d2, ax[1,0],poly_d2_form)
#poly_d3_log
error_distribution(y_train_poly_d3_log, y_model_validation_pred_poly_d3_log, y_model_tuning_pred_poly_d3_log,
y_test_poly_d3_log, y_model_test_pred_poly_d3_log, ax[1,1], poly_d3_log_form)
#poly_d5_log
error_distribution(y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, y_model_tuning_pred_poly_d5_log,
y_test_poly_d5_log, y_model_test_pred_poly_d5_log, ax[1,2], poly_d5_log_form)
def model_evalution_report(uncorrelated_weighted_portfolio_trails_simulation_df):
print(" Model Validation ")
popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2,x_model_validation_poly_d3_log, \
x_model_validation_poly_d5_log, y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, \
y_model_validation_pred_poly_d3_log, y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, model_poly_d2, \
model_poly_d3_log, model_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form = \
evalute_model_parameters(uncorrelated_weighted_portfolio_trails_simulation_df)
print(tabulate(error_metrics_statistics(y_train_poly_d2, y_model_validation_pred_poly_d2, y_train_poly_d3_log, y_model_validation_pred_poly_d3_log,
y_train_poly_d5_log, y_model_validation_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form), headers='firstrow',
tablefmt='fancy_grid', maxcolwidths=[None, 8]))
print(" Model Fine-tuning ")
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_model_validation_poly_d2, \
x_model_validation_poly_d3_log, x_model_validation_poly_d5_log, y_train_poly_d2, y_model_tuning_pred_poly_d2, y_train_poly_d3_log, \
y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log, y_model_tuning_pred_poly_d5_log, poly_d2_form, \
poly_d3_log_form, poly_d5_log_form= fine_tune_hyperparmeters(uncorrelated_weighted_portfolio_trails_simulation_df)
print(tabulate(error_metrics_statistics(y_train_poly_d2, y_model_tuning_pred_poly_d2, y_train_poly_d3_log, y_model_tuning_pred_poly_d3_log, y_train_poly_d5_log,
y_model_tuning_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form), headers='firstrow',
tablefmt='fancy_grid', maxcolwidths=[None, 8]))
print(" Model Testing ")
model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, \
y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form= \
test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df)
print(tabulate(error_metrics_statistics(y_test_poly_d2, y_model_test_pred_poly_d2,y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log,
y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form ), headers='firstrow',
tablefmt='fancy_grid', maxcolwidths=[None, 8]))
residual_and_error_plotting(uncorrelated_weighted_portfolio_trails_simulation_df)
model_evalution_report(uncorrelated_weighted_portfolio_trails_simulation_df)
Model Validation
'Poly_d2 : y =0.25607 * x^2 + -0.07319 * x + -0.16664'
'Poly_d3_log: y =-0.01375 * np.log( -4.87781*x) + -0.04449 * x**3 + 0.11653 * x + -0.03184'
'Poly_d5_log: y =0.11275 * np.log( 1.17522*x) + -0.00147 * x**5'
╒══════════════════════════════════════╤═════════════════╤═════════════════════╤═════════════════════╕
│ Type Error                           │   Poly_d2 Error │   Poly_d3_log Error │   Poly_d5_log Error │
╞══════════════════════════════════════╪═════════════════╪═════════════════════╪═════════════════════╡
│ Mean Absolute Error(MAE)             │       0.0126922 │           0.0127389 │           0.0127086 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Absolute Percentage Error(MAPE) │        0.326107 │            0.326331 │             0.32606 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Neg.Mean Squared Error(RMSE)         │    -0.000277596 │        -0.000277636 │        -0.000277634 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ R-squared score                      │       -0.709529 │           -0.709779 │           -0.709765 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Error(MSE)              │     0.000277596 │         0.000277636 │         0.000277634 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Log Error(MSLE)         │     0.000255259 │         0.000255241 │         0.000255276 │
╘══════════════════════════════════════╧═════════════════╧═════════════════════╧═════════════════════╛
Model Fine-tuning
'Poly_d2 : y =0.07000 * x^2 + -0.01600 * x + -0.00900'
'Poly_d3_log: y =0.25600 * np.log( 0.34800*x) + 0.00793 * x**3 + -0.06000 * x + 0.34300'
'Poly_d5_log: y =0.08500 * np.log( 1.44000*x) + -0.00058 * x**5'
╒══════════════════════════════════════╤═════════════════╤═════════════════════╤═════════════════════╕
│ Type Error                           │   Poly_d2 Error │   Poly_d3_log Error │   Poly_d5_log Error │
╞══════════════════════════════════════╪═════════════════╪═════════════════════╪═════════════════════╡
│ Mean Absolute Error(MAE)             │       0.0155419 │           0.0199283 │           0.0157639 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Absolute Percentage Error(MAPE) │        0.467216 │            0.541214 │            0.440471 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Neg.Mean Squared Error(RMSE)         │    -0.000392389 │        -0.000555566 │        -0.000368295 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ R-squared score                      │        -1.41646 │            -2.42136 │            -1.26808 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Error(MSE)              │     0.000392389 │         0.000555566 │         0.000368295 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Log Error(MSLE)         │     0.000358339 │         0.000502965 │         0.000335809 │
╘══════════════════════════════════════╧═════════════════╧═════════════════════╧═════════════════════╛
Model Testing
'Poly_d2 : y =0.07000 * x^2 + -0.01600 * x + -0.00900'
'Poly_d3_log: y =0.25600 * np.log( 0.34800*x) + 0.00793 * x**3 + -0.06000 * x + 0.34300'
'Poly_d5_log: y =0.08500 * np.log( 1.44000*x) + -0.00058 * x**5'
╒══════════════════════════════════════╤═════════════════╤═════════════════════╤═════════════════════╕
│ Type Error                           │   Poly_d2 Error │   Poly_d3_log Error │   Poly_d5_log Error │
╞══════════════════════════════════════╪═════════════════╪═════════════════════╪═════════════════════╡
│ Mean Absolute Error(MAE)             │       0.0101095 │           0.0130635 │          0.00861755 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Absolute Percentage Error(MAPE) │        0.262366 │            0.271526 │            0.193911 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Neg.Mean Squared Error(RMSE)         │    -0.000132733 │        -0.000191517 │        -8.07217e-05 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ R-squared score                      │      -0.0889735 │           -0.571248 │             0.33774 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Error(MSE)              │     0.000132733 │         0.000191517 │         8.07217e-05 │
├──────────────────────────────────────┼─────────────────┼─────────────────────┼─────────────────────┤
│ Mean Squared Log Error(MSLE)         │     0.000121714 │         0.000170487 │         7.29219e-05 │
╘══════════════════════════════════════╧═════════════════╧═════════════════════╧═════════════════════╛
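Most R-squared values in the tables above are negative, and the row labelled "Neg.Mean Squared Error(RMSE)" is simply the MSE row with its sign flipped. The sketch below is an illustration only, written in plain Python rather than with the scikit-learn functions the notebook calls, and the toy values are hypothetical. It shows how MSE, RMSE and R-squared relate, and why R-squared drops below zero when a model predicts worse than the mean of the observed values.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when SS_res exceeds SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy values, not taken from the simulation data.
y_true = [0.02, 0.04, 0.06, 0.08]
close_pred = [0.025, 0.035, 0.065, 0.075]   # small errors around the truth
poor_pred = [0.08, 0.06, 0.04, 0.02]        # trend reversed

print("R2 close:", r_squared(y_true, close_pred))
print("R2 poor :", r_squared(y_true, poor_pred))
print("MSE vs RMSE:", mse(y_true, poor_pred), rmse(y_true, poor_pred))
```

Under this reading, a negative R-squared does not mean the computation failed; it means the residual sum of squares is larger than the spread of the observed returns around their mean.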
def get_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df):
    model_poly_d2, model_poly_d3_log, model_poly_d5_log, popt_poly_d2, popt_poly_d3_log, popt_poly_d5_log, \
    x_test_poly_d2, x_test_poly_d3_log, x_test_poly_d5_log, y_test_poly_d2, y_model_test_pred_poly_d2, \
    y_test_poly_d3_log, y_model_test_pred_poly_d3_log, y_test_poly_d5_log, \
    y_model_test_pred_poly_d5_log, poly_d2_form, poly_d3_log_form, poly_d5_log_form = \
        test_the_model(uncorrelated_weighted_portfolio_trails_simulation_df)
    # The degree 2 polynomial is returned as the winning model
    return model_poly_d2, popt_poly_d2, poly_d2_form

def plotting_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df, model_poly_d2, popt_poly_d2, poly_d2_form):
    fig, ax = plt.subplots(figsize=(9, 7), constrained_layout=True)
    cplt = model_uperBound_efficient_frontier(uncorrelated_weighted_portfolio_trails_simulation_df, model_poly_d2, popt_poly_d2, ax, poly_d2_form)
    cb = fig.colorbar(cplt, ax=ax, label='Sharpe Ratio', orientation='horizontal', shrink=0.6)

model_poly_d2, popt_poly_d2, poly_d2_form = get_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df)
plotting_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df, model_poly_d2, popt_poly_d2, poly_d2_form)
Here we use the winning efficient frontier model to predict the portfolio expected return. We then calculate the portfolio weights and the investment strategy. The following two strategies will be implemented to manage the volatility:
def plotting_selected_efficient_frontier_predicted_portfolio(uncorrelated_weighted_portfolio_trails_simulation_df, risk):
    fig, ax = plt.subplots(figsize=(12, 5))
    text = ["A", "B", "C", "D", "E", "F"]
    model_poly_d2, popt_poly_d2, poly_d2_form = get_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df)
    predicted_return = model_poly_d2(risk, *popt_poly_d2)
    ax.plot(risk, predicted_return, '*', color='red', label='Optimal portfolios')
    scplt = model_uperBound_efficient_frontier(uncorrelated_weighted_portfolio_trails_simulation_df, model_poly_d2, popt_poly_d2, ax, poly_d2_form)
    portfolio_annotation(risk, predicted_return, text, ax)
    cb = fig.colorbar(scplt, ax=ax, label='Sharpe Ratio')
def predict_portfolio_expectded_return(uncorrelated_weighted_portfolio_trails_simulation_df, risk):
    model_poly_d2, popt_poly_d2, poly_d2_form = get_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df)
    display(poly_d2_form)
    return model_poly_d2(risk, *popt_poly_d2)

pred_portfolio_expected_return = predict_portfolio_expectded_return(uncorrelated_weighted_portfolio_trails_simulation_df, 1.3)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
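As a worked illustration of the prediction step, the sketch below evaluates the quadratic form printed above at the risk level 1.3 used in the call. The helper name `quadratic_frontier` is hypothetical, and the coefficients are copied from the printed label. `predict_portfolio_expectded_return` itself passes `popt_poly_d2` to the model, so the notebook's numeric result may differ if those fitted parameters are not the same as the printed coefficients.

```python
def quadratic_frontier(risk, a, b, c):
    """Evaluate a * risk**2 + b * risk + c (hypothetical helper)."""
    return a * risk ** 2 + b * risk + c

# Coefficients copied from the printed label 'y =0.07000 * x^2 + -0.01600 * x + -0.00900'.
a, b, c = 0.07, -0.016, -0.009
portfolio_risk = 1.3  # the risk value passed in the call above

predicted_return = quadratic_frontier(portfolio_risk, a, b, c)
# 0.07 * 1.69 - 0.016 * 1.3 - 0.009 = 0.1183 - 0.0208 - 0.009
print(round(predicted_return, 4))
```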
def get_assets_expected_returns_and_tickers(log_returns, most_diversify_portfolio_assets_list):
    # Mean log return of each asset in the most diversified portfolio
    uncorrelated_assets_log_returns = log_returns[most_diversify_portfolio_assets_list]
    uncorrelated_assets_expected_return = uncorrelated_assets_log_returns.mean()
    assets_ticker_list = uncorrelated_assets_expected_return.index.tolist()
    assets_expected_returns_list = uncorrelated_assets_expected_return.to_list()
    return assets_expected_returns_list, assets_ticker_list

assets_expected_returns_list, assets_ticker_list = get_assets_expected_returns_and_tickers(log_returns, most_diversify_portfolio_assets_list)
def get_portfolio_investment_strategy_df(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                         most_diversify_portfolio_assets_list, portfolio_risk):
    # Expected return of each stock, scaled by 100 and rounded
    assets_expected_returns_list, assets_ticker_list = get_assets_expected_returns_and_tickers(
        log_returns, most_diversify_portfolio_assets_list)
    assets_expected_returns = np.round(np.array(assets_expected_returns_list) * 100, 3)
    # Predicted portfolio expected return, given the portfolio volatility (risk)
    portfolio_return_predicted_value = round(predict_portfolio_expectded_return(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk), 3)
    # Absolute deviation of each asset's expected return from the predicted portfolio return
    assets_expected_return_absolute_deviation = np.round(
        np.abs(portfolio_return_predicted_value - assets_expected_returns), 3)
    sum_expected_return_absolute_deviation = round(assets_expected_return_absolute_deviation.sum(), 3)
    # Asset weights: each deviation as a share of the total deviation
    assets_weight_list = list(np.round(
        assets_expected_return_absolute_deviation / sum_expected_return_absolute_deviation, 3))
    # Index constituents (company, sector, industry) for the selected tickers;
    # index_content_df is defined earlier in the notebook
    portfolio_content_df = index_content_df[index_content_df['Ticker'].isin(assets_ticker_list)]
    # Portfolio strategy data frame; the column name 'Asset Espected Returns' is kept
    # as spelled because the plotting functions look it up by this name
    portfolio_investment_strategy_df = pd.DataFrame({'Ticker': assets_ticker_list, 'Weight': assets_weight_list,
                                                     'Asset Espected Returns': list(assets_expected_returns)})
    portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Weight', ascending=True)
    # Merge the content data frame and the weight data frame
    portfolio_investment_strategy_df = pd.merge(portfolio_content_df, portfolio_investment_strategy_df, how="inner", on=["Ticker"])
    return portfolio_investment_strategy_df

portfolio_investment_strategy_df = get_portfolio_investment_strategy_df(log_returns,
    uncorrelated_weighted_portfolio_trails_simulation_df, most_diversify_portfolio_assets_list, 1.3)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
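The weighting rule in `get_portfolio_investment_strategy_df` can be illustrated on its own: each asset's weight is its absolute deviation from the predicted portfolio return, divided by the sum of all deviations. The sketch below is a minimal pure-Python version with hypothetical numbers; the function name `deviation_weights` is not part of the notebook, and the rounding steps of the original are left out.

```python
def deviation_weights(asset_returns, portfolio_return):
    """Weights proportional to |portfolio_return - asset_return| (hypothetical helper)."""
    deviations = [abs(portfolio_return - r) for r in asset_returns]
    total = sum(deviations)
    if total == 0:
        # Every asset matches the target exactly; the rule is undefined here.
        raise ValueError("all deviations are zero, weights are undefined")
    return [d / total for d in deviations]

# Hypothetical asset expected returns and predicted portfolio return.
asset_returns = [0.05, 0.10, 0.20]
portfolio_return = 0.10

weights = deviation_weights(asset_returns, portfolio_return)
print(weights)       # one weight per asset
print(sum(weights))  # the weights sum to 1 up to floating-point error
```

One property of this rule worth keeping in mind: an asset whose expected return equals the predicted portfolio return receives a weight of zero, while the assets furthest from it receive the largest weights.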
def portfolio_annotation(x, y, text, ax):
    # Loop for annotation of all points
    for i in range(len(x)):
        ax.annotate(text[i]+'(σp='+str(round(x[i],3))+';E_rp='+ str(round(y[i],3))+')',
                    xy=(x[i], y[i]), xycoords='data', xytext=(x[i], y[i]))
def plot_investment_strategy_pie_chart(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, portfolio_risk, risk_profile = ''):
portfolio_investment_strategy_df = get_portfolio_investment_strategy_df( \
log_returns,uncorrelated_weighted_portfolio_trails_simulation_df, most_diversify_portfolio_assets_list, portfolio_risk)
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Weight',ascending=True)
industry_labels = portfolio_investment_strategy_df['Industry'].values
sector_labels = portfolio_investment_strategy_df['Sector'].values
weight_values = portfolio_investment_strategy_df['Weight'].values
# Create subplots: use 'domain' type for Pie subplot
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(labels=industry_labels, values=weight_values, name="Industry",
legendgroup="Industry", # this can be any string, not just "group"
legendgrouptitle_text="Industry"), 1, 1)
fig.add_trace(go.Pie(labels=sector_labels, values=weight_values, name="Sector",
legendgroup="Sector", # this can be any string, not just "group"
legendgrouptitle_text="Sector"), 1, 2)
# Use `hole` to create a donut-like pie chart
fig.update_traces(hole=.5, hoverinfo="label+percent+name")
fig.update_layout(
title_text= risk_profile+" Suggested Investment by Industry & Sector",
# Add annotations in the center of the donut pies.
annotations=[dict(text='Industry', x=0.14, y=0.5, font_size=20, showarrow=False),
dict(text='Sector', x=0.84, y=0.5, font_size=20, showarrow=False)],
height=500,
width=800,
autosize=True,
margin=dict(t=0, b=0, l=50, r=0),
legend_tracegroupgap = 0,
legend=dict(
orientation="v",
yanchor="bottom",
y=0,
xanchor="right",
x=1.5),
title=dict(
y=0.9,
x=0.1,
xanchor= 'left',
yanchor= 'top'))
fig.show()
def plot_asset_return_pie_chart(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, portfolio_risk, risk_profile = ''):
portfolio_investment_strategy_df = get_portfolio_investment_strategy_df( \
log_returns,uncorrelated_weighted_portfolio_trails_simulation_df, most_diversify_portfolio_assets_list, portfolio_risk)
portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Asset Espected Returns',ascending=True)
industry_labels = portfolio_investment_strategy_df['Industry'].values
sector_labels = portfolio_investment_strategy_df['Sector'].values
weight_values = portfolio_investment_strategy_df['Asset Espected Returns'].values
# Create subplots: use 'domain' type for Pie subplot
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(labels=industry_labels, values=weight_values, name="Industry",
legendgroup="Industry", # this can be any string, not just "group"
legendgrouptitle_text="Industry"), 1, 1)
fig.add_trace(go.Pie(labels=sector_labels, values=weight_values, name="Sector",
legendgroup="Sector", # this can be any string, not just "group"
legendgrouptitle_text="Sector"), 1, 2)
# Use `hole` to create a donut-like pie chart
fig.update_traces(hole=.5, hoverinfo="label+percent+name")
fig.update_layout(
title_text=risk_profile+" Asset Returns by Industry & Sector",
# Add annotations in the center of the donut pies.
annotations=[dict(text='Industry', x=0.14, y=0.5, font_size=20, showarrow=False),
dict(text='Sector', x=0.84, y=0.5, font_size=20, showarrow=False)],
height=500,
width=800,
autosize=True,
margin=dict(t=0, b=0, l=50, r=0),
legend_tracegroupgap = 0,
legend=dict(
orientation="v",
yanchor="bottom",
y=0,
xanchor="right",
x=1.5),
title=dict(
y=0.9,
x=0.1,
xanchor= 'left',
yanchor= 'top'))
fig.show()
# Plot the expected return of each asset in the suggested portfolio
def plot_asset_return(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                      most_diversify_portfolio_assets_list, portfolio_risk, risk_profile=''):
    fig, ax = plt.subplots(figsize=(12, 6))
    # Build the strategy table and sort it by asset expected return
    portfolio_investment_strategy_df = get_portfolio_investment_strategy_df(
        log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
        most_diversify_portfolio_assets_list, portfolio_risk)
    portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(
        by='Asset Espected Returns', ascending=True)
    asset_return = portfolio_investment_strategy_df['Asset Espected Returns']
    # Bar labels: "Sector: Industry: Company: Ticker". A plain string separator keeps
    # the labels in the same row order as the sorted values.
    strategy_Tickers = portfolio_investment_strategy_df['Sector'] + ': ' + \
        portfolio_investment_strategy_df['Industry'] + ': ' + \
        portfolio_investment_strategy_df['Company'] + ': ' + \
        portfolio_investment_strategy_df['Ticker']
    # Note: 'Asset Espected Returns' is already multiplied by 100 in
    # get_portfolio_investment_strategy_df, so the values are scaled a second time here.
    bar_container = ax.barh(strategy_Tickers, asset_return*100)
    ax.axes.get_xaxis().set_visible(False)
    # Setting label of y-axis
    ax.set_ylabel("Asset Tickers")
    # Setting label of x-axis
    ax.set_xlabel("Asset Return")
    ax.set_title(risk_profile+" Asset Return", fontsize=22, horizontalalignment='right', fontweight='roman')
    ax.bar_label(bar_container, fmt='{:,.1f}%')
    plt.show()
    # Asset return pie chart
    plot_asset_return_pie_chart(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                most_diversify_portfolio_assets_list, portfolio_risk, risk_profile)
# Plot the suggested portfolio weights for a given portfolio risk
def plot_predicted_portfolio_weight(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                    most_diversify_portfolio_assets_list, portfolio_risk, risk_profile=''):
    fig, ax = plt.subplots(figsize=(12, 6))
    portfolio_investment_strategy_df = get_portfolio_investment_strategy_df(
        log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
        most_diversify_portfolio_assets_list, portfolio_risk)
    portfolio_investment_strategy_df = portfolio_investment_strategy_df.sort_values(by='Weight', ascending=True)
    # Show the strategy table
    display(portfolio_investment_strategy_df.style.hide(axis='index'))
    strategy_Weight = portfolio_investment_strategy_df['Weight']
    # Bar labels: "Sector: Industry: Company: Ticker". A plain string separator keeps
    # the labels in the same row order as the sorted values.
    strategy_Tickers = portfolio_investment_strategy_df['Sector'] + ': ' + \
        portfolio_investment_strategy_df['Industry'] + ': ' + \
        portfolio_investment_strategy_df['Company'] + ': ' + \
        portfolio_investment_strategy_df['Ticker']
    bar_container = ax.barh(strategy_Tickers, strategy_Weight*100)
    ax.axes.get_xaxis().set_visible(False)
    # Setting label of y-axis
    ax.set_ylabel("Asset Tickers")
    # Setting label of x-axis
    ax.set_xlabel("Portfolio Weight")
    ax.set_title(risk_profile+" suggested Portfolio Allocation", fontsize=22, horizontalalignment='right')
    ax.bar_label(bar_container, fmt='{:,.1f}%')
    plt.show()
    # Investment strategy pie chart
    plot_investment_strategy_pie_chart(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                       most_diversify_portfolio_assets_list, portfolio_risk, risk_profile)
# Display the predicted expected return for a given portfolio risk
def get_portolio_risk_input(uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk):
    predited_portfolio_return = predict_portfolio_expectded_return(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk)
    prediction_df = pd.DataFrame([{'portfolio_risk': portfolio_risk,
                                   'Predited Portfolio Return': predited_portfolio_return}])
    display(prediction_df.style.hide(axis='index'))

# Display the reference-portfolio table and plot those portfolios on the efficient frontier
def plot_risk_tolerence_treshold(uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk=1.3):
    risk_tolerence_threshold_df = risk_tolerence_threshold(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk)
    print('\n **********************************************************************\n'+
          ' Optimal Portfolio Table - Winning Model and Efficient Frontier\n'+
          ' **********************************************************************\n')
    display(risk_tolerence_threshold_df)
    portfolio_risk_values = risk_tolerence_threshold_df['Portfolio Risk(volatility)'].values
    plotting_selected_efficient_frontier_predicted_portfolio(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk_values)
def risk_tolerence_threshold(uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk=1.3):
    # Define thresholds to track investor risk tolerance: high (aggressive investors),
    # moderate (moderate investors) and low (conservative investors)
    sim_df = uncorrelated_weighted_portfolio_trails_simulation_df  # short alias for readability
    pred_random_portfolio_return = predict_portfolio_expectded_return(sim_df, portfolio_risk)
    max_E_rp_sharpe_ratio, max_E_rp, max_E_rp_σp = get_maximun_return_portfolio(sim_df)
    pred_maximun_return_portfolio = predict_portfolio_expectded_return(sim_df, max_E_rp_σp)
    max_σp_E_rp_sharpe_ratio, max_σp_E_rp, max_σp = get_maximun_risk_portfolio(sim_df)
    pred_maximun_risk_portfolio = predict_portfolio_expectded_return(sim_df, max_σp)
    maximum_sharpe_ratio, maximum_sharpe_ratio_σp_E_rp, maximum_sharpe_ratio_σp = get_maximum_sharpe_ratio(sim_df)
    pred_maximum_sharpe_ratio = predict_portfolio_expectded_return(sim_df, maximum_sharpe_ratio_σp)
    minimum_σp_E_rp_sharpe_ratio, minimum_σp_E_rp, minimum_σp = get_minimum_risk_portfolio(sim_df)
    pred_minimum_risk_portfolio = predict_portfolio_expectded_return(sim_df, minimum_σp)
    avg_risk = sim_df['σp'].mean()
    pred_avg_risk_Expected_return = predict_portfolio_expectded_return(sim_df, avg_risk)
    index = ['A', 'B', 'C', 'D', 'E', 'F']
    # Each Sharpe ratio divides a portfolio's predicted return by that same portfolio's volatility
    risk_tolerence_threshold_df = pd.DataFrame({
        'Portfolio Type': ['Random Portfolio', 'Maximun Return Portfolio', 'Maximun Risk Portfolio',
                           'Maximum Sharpe Ratio(Tangent Portfolio)',
                           'Minimum Risk Portfolio', 'Average Volatilty'],
        'Predicted Expected Return': [pred_random_portfolio_return, pred_maximun_return_portfolio,
                                      pred_maximun_risk_portfolio, pred_maximum_sharpe_ratio,
                                      pred_minimum_risk_portfolio, pred_avg_risk_Expected_return],
        'Portfolio Risk(volatility)': [portfolio_risk, max_E_rp_σp, max_σp,
                                       maximum_sharpe_ratio_σp, minimum_σp, avg_risk],
        'Sharpe Ratio': [pred_random_portfolio_return / portfolio_risk,
                         pred_maximun_return_portfolio / max_E_rp_σp,
                         pred_maximun_risk_portfolio / max_σp,
                         pred_maximum_sharpe_ratio / maximum_sharpe_ratio_σp,
                         pred_minimum_risk_portfolio / minimum_σp,
                         pred_avg_risk_Expected_return / avg_risk]},
        index=index)
    return risk_tolerence_threshold_df
def plot_suggested_portfolio_structure(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                       most_diversify_portfolio_assets_list, portfolio_risk):
    risk_tolerence_threshold_df = risk_tolerence_threshold(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk)
    # The table is indexed by the letters 'A' to 'F', so rows are read by position with .iloc
    for i in range(len(risk_tolerence_threshold_df)):
        portfolio_risk = risk_tolerence_threshold_df['Portfolio Risk(volatility)'].iloc[i]
        predicted_expected_return = risk_tolerence_threshold_df['Predicted Expected Return'].iloc[i]
        sharpe_ratio = risk_tolerence_threshold_df['Sharpe Ratio'].iloc[i]
        print('\n *************************************\n'+
              ' Portfolio Risk(volatility) : '+str(round(portfolio_risk,3))+'\n'+
              ' Predicted Expected Return : '+str(round(predicted_expected_return,3))+'\n'+
              ' Sharpe Ratio : '+str(round(sharpe_ratio,3))+'\n'
              ' *************************************\n')
        plot_predicted_portfolio_weight(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                        most_diversify_portfolio_assets_list, portfolio_risk)
        plot_asset_return(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                          most_diversify_portfolio_assets_list, portfolio_risk)
def risk_tolerence_encoding(uncorrelated_weighted_portfolio_trails_simulation_df):
    risk_tolerence_threshold_df = risk_tolerence_threshold(uncorrelated_weighted_portfolio_trails_simulation_df)
    max_Erp = risk_tolerence_threshold_df['Portfolio Risk(volatility)']['B']
    max_riskp = risk_tolerence_threshold_df['Portfolio Risk(volatility)']['C']
    max_shape_ratiop = risk_tolerence_threshold_df['Portfolio Risk(volatility)']['D']
    min_riskp = risk_tolerence_threshold_df['Portfolio Risk(volatility)']['E']
    avg_riskp = risk_tolerence_threshold_df['Portfolio Risk(volatility)']['F']
    # σp E_rp
    simulated_risk_list = uncorrelated_weighted_portfolio_trails_simulation_df['σp']
    pred_Expected_return_list = predict_portfolio_expectded_return(
        uncorrelated_weighted_portfolio_trails_simulation_df, simulated_risk_list)
    sharpe_ratio_list = pred_Expected_return_list / simulated_risk_list
    risk_profile_list = []
    risk_profile_encoding_list = []
    # Moderate: risk between the tangent portfolio's risk and the average risk;
    # Conservative: below the tangent portfolio's risk; Aggressive: above the average risk
    for i in range(len(simulated_risk_list)):
        portfolio_risk = simulated_risk_list[i]
        if portfolio_risk >= max_shape_ratiop and portfolio_risk <= avg_riskp:
            risk_profile_list.append('Moderate')
            risk_profile_encoding_list.append(1)
        elif portfolio_risk < max_shape_ratiop:
            risk_profile_list.append('Conservative')
            risk_profile_encoding_list.append(2)
        elif portfolio_risk > avg_riskp:
            risk_profile_list.append('Aggressive')
            risk_profile_encoding_list.append(3)
    risk_tolerence_rating_df = pd.DataFrame({'Simulated Risk': simulated_risk_list,
                                             'Predicted Expected Return': pred_Expected_return_list,
                                             'Sharpe Ratio': sharpe_ratio_list,
                                             'Risk Profile': risk_profile_list,
                                             'Risk Profile Encoding Value': risk_profile_encoding_list})
    return risk_tolerence_rating_df

risk_tolerence_encoding_df = risk_tolerence_encoding(uncorrelated_weighted_portfolio_trails_simulation_df)
display(risk_tolerence_encoding_df)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| | Simulated Risk | Predicted Expected Return | Sharpe Ratio | Risk Profile | Risk Profile Encoding Value |
|---|---|---|---|---|---|
| 0 | 1.591041 | 0.055515 | 0.034892 | Aggressive | 3 |
| 1 | 1.461517 | 0.051284 | 0.035090 | Aggressive | 3 |
| 2 | 1.867592 | 0.056328 | 0.030161 | Aggressive | 3 |
| 3 | 1.423179 | 0.049561 | 0.034824 | Moderate | 1 |
| 4 | 1.282241 | 0.041377 | 0.032270 | Moderate | 1 |
| ... | ... | ... | ... | ... | ... |
| 9995 | 1.279847 | 0.041213 | 0.032202 | Moderate | 1 |
| 9996 | 1.345417 | 0.045405 | 0.033748 | Moderate | 1 |
| 9997 | 1.636396 | 0.056416 | 0.034476 | Aggressive | 3 |
| 9998 | 1.413588 | 0.049097 | 0.034732 | Moderate | 1 |
| 9999 | 1.380792 | 0.047406 | 0.034332 | Moderate | 1 |
10000 rows × 5 columns
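The three-way split used above can be checked by hand on the first rows of the table. The sketch below is only an illustration: `classify_risk_profile` is a hypothetical helper, and the two thresholds are the volatilities of the tangent portfolio (row D) and of the average-volatility portfolio (row F) as they appear in the optimal-portfolio table of this notebook.

```python
def classify_risk_profile(portfolio_risk, tangent_risk, average_risk):
    """Map a portfolio volatility to a (label, code) pair using two thresholds."""
    if portfolio_risk < tangent_risk:
        return 'Conservative', 2
    if portfolio_risk <= average_risk:
        return 'Moderate', 1
    return 'Aggressive', 3


# Thresholds copied from rows D and F of the optimal-portfolio table
TANGENT_RISK = 1.277847
AVERAGE_RISK = 1.437986

# First five simulated risks from the table above
sample_risks = [1.591041, 1.461517, 1.867592, 1.423179, 1.282241]
profiles = [classify_risk_profile(r, TANGENT_RISK, AVERAGE_RISK) for r in sample_risks]

for risk, (label, code) in zip(sample_risks, profiles):
    print(f'{risk:.6f} -> {label} ({code})')
```

With these thresholds the first three rows come out as Aggressive and the next two as Moderate, in line with the displayed table.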
# Average risk, predicted return and Sharpe ratio per risk profile
def get_risk_profile_matrix(uncorrelated_weighted_portfolio_trails_simulation_df):
    risk_tolerence_encoding_df = risk_tolerence_encoding(uncorrelated_weighted_portfolio_trails_simulation_df)
    risk_profile_matrix = risk_tolerence_encoding_df.groupby('Risk Profile')[
        ['Simulated Risk', 'Predicted Expected Return', 'Sharpe Ratio']].mean()
    return pd.DataFrame(risk_profile_matrix)

risk_profile_matrix = get_risk_profile_matrix(uncorrelated_weighted_portfolio_trails_simulation_df)
risk_profile_matrix
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Risk Profile | Simulated Risk | Predicted Expected Return | Sharpe Ratio |
|---|---|---|---|
| Aggressive | 1.523578 | 0.053271 | 0.034979 |
| Conservative | 1.235111 | 0.037888 | 0.030628 |
| Moderate | 1.371376 | 0.046754 | 0.034070 |
def plot_suggested_risk_profile_portfolio_structure(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                                    most_diversify_portfolio_assets_list, portfolio_risk=1.3):
    risk_profile_matrix = get_risk_profile_matrix(uncorrelated_weighted_portfolio_trails_simulation_df)
    print('\n **********************************************************************\n'+
          ' Investment Profile Simulation And Portfolio Allocation \n'+
          ' **********************************************************************\n')
    display(risk_profile_matrix)
    # The matrix is indexed by profile name, so rows are read by position with .iloc
    for i in range(len(risk_profile_matrix)):
        risk_profile = risk_profile_matrix.index[i]
        portfolio_risk = risk_profile_matrix['Simulated Risk'].iloc[i]
        predicted_expected_return = risk_profile_matrix['Predicted Expected Return'].iloc[i]
        sharpe_ratio = risk_profile_matrix['Sharpe Ratio'].iloc[i]
        print('\n *****************************************************\n'+
              ' Risk Profile : '+risk_profile+' Investment \n'+
              ' Simulated Risk : '+str(round(portfolio_risk,3))+'\n'+
              ' Predicted Expected Return : '+str(round(predicted_expected_return,3))+'\n'+
              ' Sharpe Ratio : '+str(round(sharpe_ratio,3))+'\n'
              ' *****************************************************\n')
        plot_predicted_portfolio_weight(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                                        most_diversify_portfolio_assets_list, portfolio_risk, risk_profile)
        plot_asset_return(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
                          most_diversify_portfolio_assets_list, portfolio_risk, risk_profile)

plot_risk_tolerence_treshold(uncorrelated_weighted_portfolio_trails_simulation_df)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
**********************************************************************
Optimal Portfolio Table - Winning Model and Efficient Frontier
**********************************************************************
| | Portfolio Type | Predicted Expected Return | Portfolio Risk(volatility) | Sharpe Ratio |
|---|---|---|---|---|
| A | Random Portfolio | 0.042569 | 1.300000 | 0.032745 |
| B | Maximun Return Portfolio | 0.057326 | 1.767518 | 0.044862 |
| C | Maximun Risk Portfolio | 0.037199 | 2.274133 | 0.016357 |
| D | Maximum Sharpe Ratio(Tangent Portfolio) | 0.041076 | 1.277847 | 0.032144 |
| E | Minimum Risk Portfolio | 0.022132 | 1.055724 | 0.020964 |
| F | Average Volatilty | 0.050252 | 1.437986 | 0.034946 |
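As a quick arithmetic check on the table above, the Sharpe ratio column can be recomputed as predicted expected return divided by volatility (the code uses no risk-free rate, so none is subtracted here). This is only a sketch: the dictionary `rows` is a hypothetical container holding the return and volatility copied from each row of the table.

```python
# (predicted expected return, volatility) copied from the table, keyed by row label
rows = {
    'A': (0.042569, 1.300000),
    'B': (0.057326, 1.767518),
    'C': (0.037199, 2.274133),
    'D': (0.041076, 1.277847),
    'E': (0.022132, 1.055724),
    'F': (0.050252, 1.437986),
}

# Sharpe ratio with a zero risk-free rate: return divided by volatility
sharpe = {label: ret / risk for label, (ret, risk) in rows.items()}

for label, value in sharpe.items():
    print(f'{label}: {value:.6f}')

# Row B's return divided by row D's (tangent portfolio) volatility, for comparison
b_return_over_d_risk = rows['B'][0] / rows['D'][1]
print(f'B return / D risk: {b_return_over_d_risk:.6f}')
```

Rows A, C, D, E and F reproduce the displayed column to within rounding. For row B the quotient is about 0.0324, while the displayed 0.044862 matches row B's return divided by the tangent portfolio's volatility (row D), so the row B figure is best read with that in mind.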
plot_suggested_portfolio_structure( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, 1.3)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 1.3
Predicted Expected Return : 0.043
Sharpe Ratio : 0.033
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.016000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.023000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.047000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.051000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.066000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.074000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.113000 | 0.014000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.121000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.125000 | 0.075000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.168000 | 0.000000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.195000 | 0.093000 |
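To make the allocation table easier to read, the sketch below applies the standard portfolio identity, expected portfolio return = sum of weight times asset expected return, to the rows above. Whether `get_portfolio_investment_strategy_df` uses this identity internally is not shown in this section, so treat it as an assumption; the lists `weights` and `asset_returns` are simply the two numeric columns copied from the table.

```python
# Weight and asset expected return columns copied from the table above (row order preserved)
weights = [0.016, 0.023, 0.047, 0.051, 0.066, 0.074, 0.113, 0.121, 0.125, 0.168, 0.195]
asset_returns = [0.047, 0.037, 0.031, 0.030, 0.026, 0.024, 0.014, 0.074, 0.075, 0.000, 0.093]

# The weights should add up to roughly 1 (the table is rounded to three decimals)
total_weight = sum(weights)

# Weighted sum of the asset expected returns
weighted_return = sum(w * r for w, r in zip(weights, asset_returns))

print(f'Total weight   : {total_weight:.3f}')
print(f'Weighted return: {weighted_return:.6f}')
```

The weighted sum does not have to equal the predicted expected return printed above the table, since that figure comes from the fitted efficient frontier model rather than from the rounded weights.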
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 1.768
Predicted Expected Return : 0.057
Sharpe Ratio : 0.045
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.031000 | 0.047000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.053000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.057000 | 0.075000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.063000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.082000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.085000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.097000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.104000 | 0.024000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.113000 | 0.093000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.135000 | 0.014000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.179000 | 0.000000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 2.274
Predicted Expected Return : 0.037
Sharpe Ratio : 0.016
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.000000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.025000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.029000 | 0.030000 |
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.042000 | 0.047000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.046000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.055000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.097000 | 0.014000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.155000 | 0.074000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.155000 | 0.000000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.160000 | 0.075000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.235000 | 0.093000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 1.278
Predicted Expected Return : 0.041
Sharpe Ratio : 0.032
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.016000 | 0.037000 |
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.024000 | 0.047000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.040000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.044000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.060000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.068000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.108000 | 0.014000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.132000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.136000 | 0.075000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.164000 | 0.000000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.208000 | 0.093000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 1.056
Predicted Expected Return : 0.022
Sharpe Ratio : 0.021
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.007000 | 0.024000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.015000 | 0.026000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.030000 | 0.030000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.030000 | 0.014000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.033000 | 0.031000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.056000 | 0.037000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.082000 | 0.000000 |
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.093000 | 0.047000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.193000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.197000 | 0.075000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.264000 | 0.093000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*************************************
Portfolio Risk(volatility) : 1.438
Predicted Expected Return : 0.05
Sharpe Ratio : 0.035
*************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.011000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.046000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.067000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.071000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.085000 | 0.026000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.085000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.088000 | 0.075000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.092000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.127000 | 0.014000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.152000 | 0.093000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.177000 | 0.000000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
plot_suggested_risk_profile_portfolio_structure( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, portfolio_risk=1.3)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
**********************************************************************
Investment Profile Simulation And Portfolio Allocation
**********************************************************************
| Risk Profile | Simulated Risk | Predicted Expected Return | Sharpe Ratio |
|---|---|---|---|
| Aggressive | 1.523578 | 0.053271 | 0.034979 |
| Conservative | 1.235111 | 0.037888 | 0.030628 |
| Moderate | 1.371376 | 0.046754 | 0.034070 |
*****************************************************
Risk Profile : Aggressive Investment
Simulated Risk : 1.524
Predicted Expected Return : 0.053
Sharpe Ratio : 0.035
*****************************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.023000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.056000 | 0.037000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.066000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.069000 | 0.075000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.076000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.079000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.092000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.099000 | 0.024000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.129000 | 0.093000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.132000 | 0.014000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.178000 | 0.000000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*****************************************************
Risk Profile : Conservative Investment
Simulated Risk : 1.235
Predicted Expected Return : 0.038
Sharpe Ratio : 0.031
*****************************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.004000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.029000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.033000 | 0.030000 |
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.037000 | 0.047000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.050000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.058000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.100000 | 0.014000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.149000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.154000 | 0.075000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.158000 | 0.000000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.228000 | 0.093000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*****************************************************
Risk Profile : Moderate Investment
Simulated Risk : 1.371
Predicted Expected Return : 0.047
Sharpe Ratio : 0.034
*****************************************************
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
| Ticker | Company | Sector | Industry | Weight | Asset Espected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.000000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.037000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.060000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.063000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.078000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.086000 | 0.024000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.101000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.104000 | 0.075000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.123000 | 0.014000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.172000 | 0.093000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.175000 | 0.000000 |
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
My main objective is to use K-means clustering to identify the investor's type of risk tolerance: very conservative, conservative, moderate, aggressive, or very aggressive. Based on the elbow method, I assume that the optimal number of clusters is 5.
In this section, I combine K-means clustering with efficient frontier modeling to explore the randomly generated portfolios. To simulate investor risk tolerance, I apply K-means clustering and optimal portfolio modeling on top of the covariance matrix technique used earlier, where a covariance threshold reduced the number of assets by keeping only uncorrelated ones. I then recommend an investment strategy that optimizes return for each type of risk tolerance. I use the winning model 'y =0.07000 * x^2 + -0.01600 * x + -0.00900' to predict the portfolio expected returns. The simulated portfolio risk, the simulated portfolio expected return, and the predicted expected return together form the random efficient frontier data, which serve as input to the K-means model. The centroid of each cluster, which corresponds to one type of risk tolerance, is then projected onto the winning efficient frontier model to find the optimal portfolio for that type of risk tolerance.
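The projection step described above can be sketched in a few lines. Everything in this block is illustrative: `frontier_model`, `label_centroids` and `PROFILE_NAMES` are hypothetical names, the coefficients are copied from the printed model string purely as placeholders, and the centroid risks are made-up values rather than results from the notebook. The sketch also names the clusters by ranking their centroid risk, which is an assumption and not necessarily how the notebook assigns profile names.

```python
def frontier_model(risk, a=0.07, b=-0.016, c=-0.009):
    """Quadratic efficient frontier model: expected return as a function of risk.
    The default coefficients are placeholders copied from the printed model string."""
    return a * risk ** 2 + b * risk + c


# Profile names ordered from lowest to highest risk
PROFILE_NAMES = ['Very Conservative', 'Conservative', 'Moderate', 'Aggressive', 'Very Aggressive']


def label_centroids(centroid_risks, model=frontier_model):
    """Rank cluster centroids by risk, name them, and project each one onto the model."""
    order = sorted(range(len(centroid_risks)), key=lambda k: centroid_risks[k])
    labelled = {}
    for rank, cluster_id in enumerate(order):
        risk = centroid_risks[cluster_id]
        predicted_return = model(risk)  # point on the frontier at the centroid's risk
        labelled[cluster_id] = {
            'profile': PROFILE_NAMES[rank],
            'risk': risk,
            'predicted_return': predicted_return,
            'sharpe_ratio': predicted_return / risk,
        }
    return labelled


# Hypothetical centroid risks, indexed by K-means cluster label 0 to 4
centroid_risks = [1.37, 1.24, 1.52, 1.70, 1.10]
labelled = label_centroids(centroid_risks)

for cluster_id in sorted(labelled):
    info = labelled[cluster_id]
    print(cluster_id, info['profile'], round(info['predicted_return'], 4))
```

Ranking the centroids rather than relying on the cluster label numbers keeps the profile names stable, because K-means label numbers carry no order and can change with the random seed.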
def calculate_number_of_cluster(uncorrelated_weighted_portfolio_trails_simulation_df, n_components, ax):
    # Random efficient frontier data collection
    portfolio_risk_list = uncorrelated_weighted_portfolio_trails_simulation_df['σp']
    portefolio_return_list = uncorrelated_weighted_portfolio_trails_simulation_df['E_rp']
    predited_portfolio_return_list = predict_portfolio_expectded_return(
        uncorrelated_weighted_portfolio_trails_simulation_df, portfolio_risk_list)
    clipped_df = dataframe_clipping(portfolio_risk_list, portefolio_return_list, predited_portfolio_return_list)
    range_nbr_clusters = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
    # Standardized copy of the data (computed here but not used below: K-means is fit on clipped_df)
    scaler = StandardScaler()
    scaled_efficient_Frontier_data = scaler.fit_transform(clipped_df)
    # Determine the number of clusters using the within-cluster sum of squares (WCSS)
    # together with the average silhouette score
    wcss = []  # within-cluster sum of squares (inertia)
    silhouette_average_list = []
    for n2 in range_nbr_clusters:
        kmeans = KMeans(n_clusters=n2, init='k-means++', max_iter=300, n_init=10, random_state=0)
        # A single fit provides both the labels and the inertia
        cluster_labels = kmeans.fit_predict(clipped_df)
        wcss.append(kmeans.inertia_)
        silhouette_average_list.append(silhouette_score(clipped_df, cluster_labels))
    # WCSS on the left axis, silhouette score on a second y-axis
    ax1 = ax.twinx()
    ax.plot(range_nbr_clusters, wcss, 'b-', marker='o')
    ax1.plot(range_nbr_clusters, silhouette_average_list, 'g-', marker='o')
    ax.set_xlabel('Number of Clusters')
    ax.set_ylabel('Within Cluster Sum of Squares(wcss)')
    ax1.set_ylabel('Silhouette score')
    ax.set_title('Elbow Method & Silhouette Analysis for Optimal Number of Clusters')
    return clipped_df
def implement_k_means_clusters(uncorrelated_weighted_portfolio_trails_simulation_df, clipped_df, fig, ax):
# our main objective is to use K-means clustering to found out investment risk profile: very conservative, conservative, moderate, aggressive and very aggressive.
# So from the Elbow model, let's assume the optimal number of clusters is 5
kmeans = KMeans(n_clusters=5, init ='k-means++', max_iter=300, n_init=10,random_state=0 )
#clusters = kmeans.fit_predict(pca_data)
pred_clusters = kmeans.fit_predict(clipped_df)
rand_data_point_and_cluster_df = clipped_df
rand_data_point_and_cluster_df['cluster'] = pred_clusters
#investment profile
investment_profiles_index = ['Moderate', 'Conservative', 'Agressive', 'Very Aggressive', 'Very Conservative']
investment_profiles_color = ['purple', 'gold', 'limegreen', 'green', 'yellow']
display(rand_data_point_and_cluster_df)
#plot cluster
for i in range(len(investment_profiles_index)):
cspl = ax.scatter(x=rand_data_point_and_cluster_df.loc[(rand_data_point_and_cluster_df['cluster'] ==i), ['σp']],
y=rand_data_point_and_cluster_df.loc[(rand_data_point_and_cluster_df['cluster'] ==i), ['E_rp']],
c= investment_profiles_color[i], cmap="viridis",label=investment_profiles_index[i])
# find clusters centratides
cluster_centers_df = pd.DataFrame(kmeans.cluster_centers_)
cluster_centers_df = cluster_centers_df.set_axis( kmeans.feature_names_in_ , axis=1)
cluster_centers_df.index = investment_profiles_index
cluster_centers_df.index.names = ['Investment Profile']
#plot clusters centroid.
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], marker=".", s=100, c='red', label = 'Cluster Centroids')
# plot efficient frontier model
model_poly_d2, popt_poly_d2, poly_d2_form = get_wining_model(uncorrelated_weighted_portfolio_trails_simulation_df)
xpoints,ypoints,top_sharpe_ratio_value_points = efficient_frontiere_optimal_portfolios_model_points(clipped_df,7)
row, col = uncorrelated_weighted_portfolio_trails_simulation_df.shape
x_model_σp = np.linspace(xpoints.min(), xpoints.max(), row)
y_model_E_rp_pred = model_poly_d2(x_model_σp, *popt_poly_d2)
scplt = ax.scatter(x=x_model_σp, y=y_model_E_rp_pred, marker="D", c= y_model_E_rp_pred/x_model_σp,
cmap="viridis",label='Efficient Frontier:\n'+poly_d2_form)
cb = fig.colorbar(scplt, ax=ax, label='Sharpe Ratio')
# find the model-predicted expected return at each centroid
pred_centroide_Expr_list = model_poly_d2(cluster_centers_df['σp'], *popt_poly_d2)
cluster_centers_df['Pred Centroide Expr'] =pred_centroide_Expr_list
cluster_centers_df['Pred Centroide Sharpe Ratio'] =pred_centroide_Expr_list/cluster_centers_df['σp']
display(cluster_centers_df)
# plot the model centroids
ax.scatter(cluster_centers_df['σp'], pred_centroide_Expr_list, marker=".", s=100, c='blue', label = 'Model Centroids')
ax.set_title('Simulated Portfolio Clusters')
ax.set_xlabel('Volatility(Risk)')
ax.set_ylabel('Expected Return ')
ax.legend(prop = { "size": 8 })
plt.show()
# Silhouette Score to evaluate the clustering
sil_score = silhouette_score(clipped_df, pred_clusters)
print(f'Silhouette Score: {sil_score}')
return cluster_centers_df
def plot_predicted_clusters_risk_profile_portfolio_allocation( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, cluster_centers_df):
print('\n **********************************************************************\n'+
' Investment Profile Simulation And Portfolio Allocation \n'+
' **********************************************************************\n')
for i in range(len(cluster_centers_df)):
risk_profile = cluster_centers_df.index[i]
# use positional access: the index holds profile labels, not integers
portfolio_risk = cluster_centers_df['σp'].iloc[i]
predicted_expected_return = cluster_centers_df['Pred Centroide Expr'].iloc[i]
sharpe_ratio = cluster_centers_df['Pred Centroide Sharpe Ratio'].iloc[i]
print('\n *****************************************************\n'+
' Risk Profile : '+risk_profile+' Investment \n'+
' Simulated Risk : '+str(round(portfolio_risk,3))+'\n'+
' Predicted Expected Return : '+str(round(predicted_expected_return,3))+'\n'+
' Sharpe Ratio : '+str(round(sharpe_ratio,3))+'\n'
' *****************************************************\n')
plot_predicted_portfolio_weight( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, portfolio_risk,risk_profile)
plot_asset_return( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, portfolio_risk,risk_profile)
def implement_investement_profile_simulation(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, n_components):
warnings.filterwarnings("ignore")
fig, ax =plt.subplots(1,2,figsize=(21, 5))
clipped_df = calculate_number_of_cluster(uncorrelated_weighted_portfolio_trails_simulation_df,n_components, ax[0])
print('\n *********************************************************************************\n'+
' Investment profile simulation - Optimal Portfolio - Efficient Frontier Model \n'+
' *********************************************************************************\n')
cluster_centers_df = implement_k_means_clusters(uncorrelated_weighted_portfolio_trails_simulation_df, clipped_df, fig, ax[1])
plot_predicted_clusters_risk_profile_portfolio_allocation( log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, cluster_centers_df)
implement_investement_profile_simulation(log_returns, uncorrelated_weighted_portfolio_trails_simulation_df,
most_diversify_portfolio_assets_list, 2)
'y =0.07000 * x^2 + -0.01600 * x + -0.00900'
*********************************************************************************
Investment profile simulation - Optimal Portfolio - Efficient Frontier Model
*********************************************************************************
| | σp | E_rp | y_E_rp_pred | error | y_optimal_E_rp | sharpes_ratio | cluster |
|---|---|---|---|---|---|---|---|
| 3536 | 1.480160 | 0.031575 | 0.052045 | 0.020469 | 0.031575 | 0.021332 | 4 |
| 861 | 1.538196 | 0.033908 | 0.054085 | 0.020177 | 0.033908 | 0.022044 | 1 |
| 1686 | 1.513095 | 0.033426 | 0.053263 | 0.019837 | 0.033426 | 0.022091 | 1 |
| 7572 | 1.458223 | 0.031368 | 0.051145 | 0.019777 | 0.031368 | 0.021511 | 4 |
| 8227 | 1.527153 | 0.033984 | 0.053735 | 0.019751 | 0.033984 | 0.022253 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 9656 | 1.213137 | 0.036272 | 0.036302 | 0.000031 | 0.036272 | 0.029899 | 0 |
| 252 | 1.259952 | 0.039787 | 0.039817 | 0.000030 | 0.039787 | 0.031578 | 0 |
| 8959 | 1.337616 | 0.044917 | 0.044940 | 0.000023 | 0.044917 | 0.033580 | 2 |
| 4610 | 1.317238 | 0.043660 | 0.043681 | 0.000020 | 0.043660 | 0.033145 | 0 |
| 2529 | 1.221793 | 0.036966 | 0.036977 | 0.000011 | 0.036966 | 0.030255 | 0 |
9897 rows × 7 columns
| Investment Profile | σp | E_rp | y_E_rp_pred | error | y_optimal_E_rp | sharpes_ratio | Pred Centroide Expr | Pred Centroide Sharpe Ratio |
|---|---|---|---|---|---|---|---|---|
| Moderate | 1.281066 | 0.034786 | 0.041189 | 0.006403 | 0.034786 | 0.027137 | 0.041297 | 0.032236 |
| Conservative | 1.537969 | 0.044667 | 0.054027 | 0.009360 | 0.044667 | 0.029042 | 0.054078 | 0.035162 |
| Aggressive | 1.376557 | 0.038718 | 0.047131 | 0.008413 | 0.038718 | 0.028125 | 0.047176 | 0.034271 |
| Very Aggressive | 1.643127 | 0.048177 | 0.056376 | 0.008199 | 0.048177 | 0.029321 | 0.056524 | 0.034400 |
| Very Conservative | 1.455492 | 0.041679 | 0.050990 | 0.009312 | 0.041679 | 0.028633 | 0.051028 | 0.035059 |
Silhouette Score: 0.9673068137975777
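K-means cluster numbers are arbitrary, so a fixed list of profile names assigned by cluster number need not follow the risk ordering: in the centroid table above, 'Very Conservative' has σp ≈ 1.455 while 'Moderate' has σp ≈ 1.281. The sketch below shows one possible alternative, ranking the centroids by volatility before naming them. The helper `label_clusters_by_risk` is hypothetical and not part of the original code; the σp values are copied from the table.

```python
import pandas as pd

# Profile names in ascending order of risk.
RISK_PROFILES = ['Very Conservative', 'Conservative', 'Moderate',
                 'Aggressive', 'Very Aggressive']

def label_clusters_by_risk(cluster_centers_df, risk_col='σp', profiles=RISK_PROFILES):
    """Return a copy of the centroid table indexed by risk profile.

    Hypothetical helper: centroids are ranked by volatility, so the lowest-risk
    centroid gets the first profile name regardless of its K-means cluster number.
    """
    if len(cluster_centers_df) != len(profiles):
        raise ValueError('need exactly one profile name per cluster')
    ordered = cluster_centers_df.sort_values(risk_col).copy()
    ordered['cluster'] = ordered.index  # keep the original K-means cluster number
    ordered.index = pd.Index(profiles, name='Investment Profile')
    return ordered

# Centroid volatilities copied from the table above (cluster numbers 0-4).
centers = pd.DataFrame({'σp': [1.281066, 1.537969, 1.376557, 1.643127, 1.455492]})
labelled = label_clusters_by_risk(centers)
print(labelled)
```

With this ordering, the profile names increase with σp, and the `cluster` column still maps each profile back to its K-means cluster number for plotting.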
**********************************************************************
Investment Profile Simulation And Portfolio Allocation
**********************************************************************
*****************************************************
Risk Profile : Moderate Investment
Simulated Risk : 1.281
Predicted Expected Return : 0.041
Sharpe Ratio : 0.032
*****************************************************
| Ticker | Company | Sector | Industry | Weight | Asset Expected Returns |
|---|---|---|---|---|---|
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.016000 | 0.037000 |
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.024000 | 0.047000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.040000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.044000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.060000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.068000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.108000 | 0.014000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.132000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.136000 | 0.075000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.164000 | 0.000000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.208000 | 0.093000 |
*****************************************************
Risk Profile : Conservative Investment
Simulated Risk : 1.538
Predicted Expected Return : 0.054
Sharpe Ratio : 0.035
*****************************************************
| Ticker | Company | Sector | Industry | Weight | Asset Expected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.023000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.056000 | 0.037000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.066000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.069000 | 0.075000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.076000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.079000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.092000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.099000 | 0.024000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.129000 | 0.093000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.132000 | 0.014000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.178000 | 0.000000 |
*****************************************************
Risk Profile : Aggressive Investment
Simulated Risk : 1.377
Predicted Expected Return : 0.047
Sharpe Ratio : 0.034
*****************************************************
| Ticker | Company | Sector | Industry | Weight | Asset Expected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.000000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.037000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.060000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.063000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.078000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.086000 | 0.024000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.101000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.104000 | 0.075000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.123000 | 0.014000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.172000 | 0.093000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.175000 | 0.000000 |
*****************************************************
Risk Profile : Very Aggressive Investment
Simulated Risk : 1.643
Predicted Expected Return : 0.057
Sharpe Ratio : 0.034
*****************************************************
| Ticker | Company | Sector | Industry | Weight | Asset Expected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.031000 | 0.047000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.053000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.057000 | 0.075000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.063000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.082000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.085000 | 0.030000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.097000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.104000 | 0.024000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.113000 | 0.093000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.135000 | 0.014000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.179000 | 0.000000 |
*****************************************************
Risk Profile : Very Conservative Investment
Simulated Risk : 1.455
Predicted Expected Return : 0.051
Sharpe Ratio : 0.035
*****************************************************
| Ticker | Company | Sector | Industry | Weight | Asset Expected Returns |
|---|---|---|---|---|---|
| BN | Brookfield Corporation | Financial Services | Asset Management | 0.014000 | 0.047000 |
| ENB | Enbridge Inc. | Energy | Oil & Gas Storage/Transport | 0.049000 | 0.037000 |
| PEY | Peyto Exploration & Development Corp. | Energy | Oil & Gas Exploration and Production | 0.069000 | 0.031000 |
| BMO | Bank of Montreal | Financial Services | Banks | 0.073000 | 0.030000 |
| NGD | New Gold Inc. | Basic Materials | Metals & Mining | 0.080000 | 0.074000 |
| IGM | IGM Financial Inc. | Financial Services | Asset Management | 0.083000 | 0.075000 |
| DOL | Dollarama Inc. | Consumer Defensive | Retail Defensive | 0.087000 | 0.026000 |
| TD | Toronto-Dominion Bank | Financial Services | Banks | 0.094000 | 0.024000 |
| DOO | BRP Inc. | Consumer Cyclical | Vehicles & Parts | 0.128000 | 0.014000 |
| CNQ | Canadian Natural Resources Limited | Energy | Oil & Gas Exploration and Production | 0.146000 | 0.093000 |
| TVE | Tamarack Valley Energy Ltd. | Energy | Oil & Gas Exploration and Production | 0.177000 | 0.000000 |
In this section, we use Statistics Canada's API, through the stats-can package, to integrate Canadian economic factors. We then apply Principal Component Analysis (PCA) to select the most important economic factors.
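The data-retrieval functions in this section share one pattern: keep the rows on or after a start date, convert `REF_DATE` to a monthly or quarterly period, and average `VALUE` within each period. The following is a minimal sketch of that pattern on a small synthetic frame; no API call is made and the values are made up. The helper name `aggregate_by_period` and the column label `'QUARTER_YEAR'` are assumptions used only for illustration.

```python
import pandas as pd

def aggregate_by_period(df, start, period_column, value_name):
    """Average VALUE per period for rows on or after `start`.

    Hypothetical helper: `period_column` is a label such as 'MONTH_YEAR' or
    'QUARTER_YEAR'; its first letter ('M' or 'Q') selects the pandas period frequency.
    """
    frequency = period_column[0].upper()
    out = df.loc[df['REF_DATE'] >= start, ['REF_DATE', 'VALUE']].copy()
    out[period_column] = out['REF_DATE'].dt.to_period(frequency)
    out = out.groupby(period_column)[['VALUE']].mean().round(1)
    return out.rename(columns={'VALUE': value_name}).dropna()

# Synthetic monthly observations (made-up values, not Statistics Canada data).
raw = pd.DataFrame({
    'REF_DATE': pd.date_range('2023-01-01', periods=6, freq='MS'),
    'VALUE': [5.0, 5.2, 5.1, 5.4, 5.6, 5.5],
})
quarterly = aggregate_by_period(raw, '2023-01-01', 'QUARTER_YEAR', 'Unemployment rate')
print(quarterly)
```

Taking an explicit `.copy()` of the filtered rows before adding the period column avoids assigning into a slice of the source table.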
warnings.filterwarnings("ignore")
# --------------------------------------------------------------------------------------------
# Trade balance: international merchandise trade, all countries, monthly
# --------------------------------------------------------------------------------------------
def get_trade_balance_rate(reporting_year_period, frequency_date_column ):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("12-10-0011-01")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['Trade'] =='Trade Balance') &
(df['Principal trading partners'] == 'All countries'), ['REF_DATE','Trade','Principal trading partners','VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
trade_balance_df = df1[[frequency_date_column, 'VALUE']]
trade_balance_df['Trade Balance Rate'] = trade_balance_df['VALUE'].pct_change() * 100
trade_balance_rate_df= trade_balance_df.groupby(frequency_date_column).mean()
#trade_balance_rate_df = trade_balance_rate_df.rename(columns={'VALUE': 'Unemployment rate'})
trade_balance_rate_df['Trade Balance Rate'] = round(trade_balance_rate_df['Trade Balance Rate'],1)
trade_balance_rate_df = trade_balance_rate_df[['Trade Balance Rate']]
trade_balance_rate_df = trade_balance_rate_df.dropna()
return trade_balance_rate_df
# --------------------------------------------------------------------------------------------
# unemployment rate: Labour force characteristics by province, monthly, seasonally adjusted
# --------------------------------------------------------------------------------------------
def get_unemployment_rate(reporting_year_period, frequency_date_column ):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("14-10-0287-03")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['Labour force characteristics'] =='Unemployment rate') &
(df['UOM'] == 'Percentage'), ['REF_DATE','Labour force characteristics','UOM','VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
unemployment_rate_df = df1[[frequency_date_column, 'VALUE']]
unemployment_rate_df= unemployment_rate_df.groupby(frequency_date_column).mean()
unemployment_rate_df = unemployment_rate_df.rename(columns={'VALUE': 'Unemployment rate'})
unemployment_rate_df['Unemployment rate'] = round(unemployment_rate_df['Unemployment rate'],1)
unemployment_rate_df = unemployment_rate_df.dropna()
return unemployment_rate_df
# --------------------------------------------------------------------------------------------------
# Financial market statistics, last Wednesday unless otherwise stated, Bank of Canada
# --------------------------------------------------------------------------------------------------
def get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, col_value_name, rate_statement, frequency_date_column ):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("10-10-0122-01")
df2 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['UOM'] == 'Percent') & (df['Rates'].str.contains(rate_statement)),
['REF_DATE','Rates','UOM','VALUE']]
df2[frequency_date_column] = df2['REF_DATE'].dt.to_period(frequency)
df2 = df2.dropna()
goc_bonds_or_T_bill_df = df2[[frequency_date_column, 'VALUE']]
#goc_bonds_or_T_bill_df['VALUE'] = round(goc_bonds_or_T_bill_df['VALUE'],1)
goc_bonds_or_T_bill_df= goc_bonds_or_T_bill_df.groupby(frequency_date_column).mean()
goc_bonds_or_T_bill_df= goc_bonds_or_T_bill_df.rename(columns={'VALUE': col_value_name})
goc_bonds_or_T_bill_df[col_value_name] = round(goc_bonds_or_T_bill_df[col_value_name],1)
return goc_bonds_or_T_bill_df
# ---------------------------------------------------------------------------------------------------------------------
# CPI inflation: the CPI measures the average change over time in the prices paid by urban consumers
# for a market basket of consumer goods and services,
# and it's a key indicator of inflation (ING Think) (Inflation Calculator).
# ---------------------------------------------------------------------------------------------------------------------
def get_CPI_inflaction_rate(reporting_year_period, frequency_date_column):
frequency = frequency_date_column[0].upper()
alternative_measures = 'Measure of core inflation based on a factor model, CPI-common (year-over-year percent change)'
df = sc.table_to_df("18-10-0256-01")
df2 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['UOM'] == 'Percent') &
(df['Alternative measures'] == alternative_measures),
['REF_DATE','Alternative measures','UOM','VALUE']]
df2[frequency_date_column] = df2['REF_DATE'].dt.to_period(frequency)
df2 = df2.dropna()
CPI_inflaction_rate_df = df2[[frequency_date_column, 'VALUE']]
CPI_inflaction_rate_df= CPI_inflaction_rate_df.groupby(frequency_date_column).mean()
CPI_inflaction_rate_df= CPI_inflaction_rate_df.rename(columns={'VALUE': 'CPI Inflaction Rate'})
CPI_inflaction_rate_df['CPI Inflaction Rate'] = round(CPI_inflaction_rate_df['CPI Inflaction Rate'],1)
return CPI_inflaction_rate_df
# -----------------------------------------------------------------------------------
# mortgage rate
# -----------------------------------------------------------------------------------
def get_morgage_rate(reporting_year_period, frequency_date_column):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("34-10-0145-01")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period), ['REF_DATE', 'UOM','VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
get_morgage_rate_df = df1[[frequency_date_column, 'VALUE']]
get_morgage_rate_df = get_morgage_rate_df.groupby(frequency_date_column).mean()
get_morgage_rate_df = get_morgage_rate_df.rename(columns={'VALUE': 'Morgage Rate'})
get_morgage_rate_df['Morgage Rate'] = round(get_morgage_rate_df['Morgage Rate'],1)
get_morgage_rate_df = get_morgage_rate_df.dropna()
return get_morgage_rate_df
# -------------------------------------------------------------------------------------------------------------------------------------
# Prime rate
# The prime rate is the interest rate that commercial banks charge their most creditworthy customers for loans.
# In Canada, it moves with the Bank of Canada's target for the overnight rate, which is set on
# eight fixed announcement dates a year. The prime rate is the benchmark banks and other lenders
# use when setting their interest rates for every category of loan from credit cards to car loans and mortgages.
# -------------------------------------------------------------------------------------------------------------------------------------
def get_prime_rate(reporting_year_period, frequency_date_column):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("10-10-0145-01")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period), ['REF_DATE', 'UOM','VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
get_prime_rate_df = df1[[frequency_date_column, 'VALUE']]
get_prime_rate_df.set_index(frequency_date_column, inplace=True)
get_prime_rate_df = get_prime_rate_df.groupby(frequency_date_column).mean()
get_prime_rate_df = get_prime_rate_df.rename(columns={'VALUE': 'Prime Rate'})
get_prime_rate_df['Prime Rate'] = round(get_prime_rate_df['Prime Rate'],1)
get_prime_rate_df = get_prime_rate_df.dropna()
return get_prime_rate_df
# ----------------------------------------------------------------------------------------------------
# House Price Index (house and land)
# ----------------------------------------------------------------------------------------------------
def get_house_price_index(reporting_year_period, frequency_date_column):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("18-10-0205-02")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['GEO'] =='Canada') &
(df['New housing price indexes'] =='Total (house and land)')
, ['REF_DATE','New housing price indexes', 'VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
get_house_price_index_df = df1[[frequency_date_column, 'VALUE']]
get_house_price_index_df.set_index(frequency_date_column, inplace=True)
get_house_price_index_df = ((get_house_price_index_df / get_house_price_index_df.shift(1)) - 1)*100
get_house_price_index_df= get_house_price_index_df.groupby(frequency_date_column).mean()
get_house_price_index_df = get_house_price_index_df.rename(columns={'VALUE': 'House Price Index(house and land)'})
get_house_price_index_df['House Price Index(house and land)'] = round(get_house_price_index_df[
'House Price Index(house and land)'],1)
get_house_price_index_df = get_house_price_index_df.dropna()
return get_house_price_index_df.tail(60)
# -----------------------------------------------------------------------------------------
# Real GDP growth Seasonal adjustment
# -----------------------------------------------------------------------------------------
def get_Real_GDP_growth(reporting_year_period, frequency_date_column):
frequency = frequency_date_column[0].upper()
df = sc.table_to_df("36-10-0434-02")
df1 = df.loc[(df['REF_DATE'] >= reporting_year_period) & (df['GEO'] =='Canada') &
(df['North American Industry Classification System (NAICS)'] =='All industries [T001]'),
['REF_DATE','Seasonal adjustment', 'VALUE']]
df1[frequency_date_column] = df1['REF_DATE'].dt.to_period(frequency)
get_Real_GDP_growth_df = df1[[frequency_date_column, 'VALUE']]
get_Real_GDP_growth_df.set_index(frequency_date_column, inplace=True)
#get_Real_GDP_growth_df= get_Real_GDP_growth_df.groupby('MONTH_YEAR').sum()
get_Real_GDP_growth_df= get_Real_GDP_growth_df.groupby(frequency_date_column).mean()
get_Real_GDP_growth_df = ((get_Real_GDP_growth_df / get_Real_GDP_growth_df.shift(1)) - 1)*100
get_Real_GDP_growth_df = get_Real_GDP_growth_df.rename(columns={'VALUE': 'Real GDP growth Seasonal adjustment'})
get_Real_GDP_growth_df['Real GDP growth Seasonal adjustment'] = round(get_Real_GDP_growth_df[
'Real GDP growth Seasonal adjustment'],1)
get_Real_GDP_growth_df = get_Real_GDP_growth_df.dropna()
return get_Real_GDP_growth_df.tail(60)
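The two growth-rate functions above apply the same formula, `(x / x.shift(1) - 1) * 100`, but in a different order: the housing index takes the month-over-month change and then averages it per period, while real GDP averages the level per period and then takes the change. The sketch below, on made-up index levels, illustrates that the two orders give different quarterly figures.

```python
import pandas as pd

# Made-up monthly index levels (not Statistics Canada data).
levels = pd.Series(
    [100.0, 101.0, 102.0, 103.0, 104.0, 105.0],
    index=pd.period_range('2024-01', periods=6, freq='M'),
)
quarter = levels.index.asfreq('Q')

# Order used for real GDP: average the level per quarter, then take the change.
growth_of_means = levels.groupby(quarter).mean().pct_change().mul(100).dropna()

# Order used for the housing index: monthly change first, then average per quarter.
mean_of_growth = levels.pct_change().mul(100).groupby(quarter).mean()

print(growth_of_means)
print(mean_of_growth)
```

The first series is a quarter-over-quarter growth rate; the second is an average monthly growth rate within each quarter, so the two columns are not directly comparable in magnitude.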
# ------------------------------------------------------------------------------------------------------
# Market volatility
# Toronto Stock Exchange statistics: S&P/TSX 60 VIX Index (VIXI.TS)
# The S&P/TSX 60 is a market-capitalization-weighted index that tracks the performance of the 60 largest
# companies listed on the Toronto Stock Exchange (TSX). The S&P/TSX Composite, on the other hand,
# is a broader index of common stocks and income trust units listed on the TSX.
# The S&P/TSX Composite provides a more comprehensive view of the Canadian stock market.
# It includes a wider range of companies, from small-cap to large-cap. This makes it a good
# choice for investors who want to diversify their portfolio across different sectors and market capitalizations.
# https://www.spglobal.com/spdji/en/indices/equity/sp-tsx-composite-index/#overview
# Toronto Stock Exchange statistics: S&P/TSX 60 VIX Index (VIXI.TS),
# S&P/TSX Venture Composite Index (^SPCDNX) and S&P/TSX Composite index (^GSPTSE)
# The S&P 500 index, or Standard & Poor’s 500, tracks
# the performance of the stocks of 500 large-cap companies in the U.S. The ticker symbol for the S&P 500 index is ^GSPC.
# The DJIA tracks the stock prices of 30 of the biggest American companies.
# Both offer a big-picture view of the state of the stock markets in general.
# https://www.investopedia.com/ask/answers/difference-between-dow-jones-industrial-average-and-sp-500/
# The function below computes a rolling 252-day annualized volatility of daily log returns
# for each index in market_index_list.
# ---------------------------------------------------------------------------------------------------------------
def get_market_index_volatility(reporting_year_period, frequency_date_column, market_index_list = ['^GSPTSE', '^GSPC', '^DJI']):
frequency = frequency_date_column[0].upper()
start_date = reporting_year_period
end_date = date.today()
#index_yahoo_adj_close_price_data = yf.download(market_index_list, start_date, end_date, ['Adj Close'], period ='max')
#market_adj_close_price_df = index_yahoo_adj_close_price_data['Adj Close']
market_adj_close_price_df = create_adj_close_price_df(reporting_year_period, market_index_list)
market_adj_close_price_log_return_df = np.log(market_adj_close_price_df/ market_adj_close_price_df.shift(1))
# drop rows that contain any NaN
market_adj_close_price_log_return_df = market_adj_close_price_log_return_df.dropna(axis=0)
#Market volatility
market_volatility_df = market_adj_close_price_log_return_df.rolling(center=False,window= 252).std() * np.sqrt(252)
for col in list(market_volatility_df.columns):
market_volatility_df = market_volatility_df.rename(columns={col: 'Market '+col+' Volatility Index'})
market_volatility_df = market_volatility_df.dropna(axis=0)
market_volatility_df[frequency_date_column] = pd.to_datetime(market_volatility_df.index, format = '%m/%Y')
market_volatility_df[frequency_date_column] = market_volatility_df[frequency_date_column].dt.to_period(frequency)
#market_adj_close_price_log_return_frequency_df = market_volatility_df
market_volatility_df.set_index(frequency_date_column, inplace=True)
market_volatility_index_df = market_volatility_df.groupby(frequency_date_column).mean()
market_volatility_index_df = round(market_volatility_index_df,1)
market_volatility_index_df = market_volatility_index_df.dropna(axis=0)
return market_volatility_index_df
#if frequency == 'M' :
# return market_volatility_index_df.tail(60)
#else:
# return market_volatility_index_df.tail(20)
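The volatility measure computed above is the rolling standard deviation of daily log returns, scaled by the square root of 252 trading days. The sketch below reproduces that calculation on a synthetic price path with a short window so the arithmetic can be checked by hand. The function name `rolling_annualized_volatility` and the data are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def rolling_annualized_volatility(prices, window=252, trading_days=252):
    """Rolling standard deviation of daily log returns, annualized."""
    log_returns = np.log(prices / prices.shift(1)).dropna()
    return (log_returns.rolling(window=window).std() * np.sqrt(trading_days)).dropna()

# Synthetic prices (made-up data) whose log returns alternate between +1% and -1%.
log_r = np.array([0.01, -0.01] * 5)
zigzag = pd.Series(
    100 * np.exp(np.concatenate([[0.0], np.cumsum(log_r)])),
    index=pd.bdate_range('2024-01-01', periods=11),
)
vol_zigzag = rolling_annualized_volatility(zigzag, window=4)
print(vol_zigzag)
```

Every 4-day window here holds two +1% and two -1% returns, so each rolling value equals the sample standard deviation of those four numbers times the square root of 252.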
#-------------------------------------------------------Government of Canada Bonds average----------------------------------------------
def goc_bonds_average(reporting_year_period, frequency_date_column):
goc_bonds_average_yield_1_3_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC Marketable Bonds Average Yield: 1-3 year',
'Government of Canada marketable bonds, average yield: 1-3 year', frequency_date_column)
goc_bonds_average_yield_5_10_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC Marketable Bonds Average Yield: 5-10 year',
'Government of Canada marketable bonds, average yield: 5-10 year', frequency_date_column)
goc_bonds_average_yield_3_5_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC Marketable Bonds Average Yield: 3-5 year',
'Government of Canada marketable bonds, average yield: 3-5 year', frequency_date_column)
goc_bonds_average_yield_over_10_years_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period,
'GOC Marketable Bonds Average Yield: over 10 years',
'Government of Canada marketable bonds, average yield: over 10 years', frequency_date_column)
goc_bonds_average_df = goc_bonds_average_yield_1_3_df.merge(goc_bonds_average_yield_5_10_df,
on= frequency_date_column, how='inner') \
.merge(goc_bonds_average_yield_3_5_df, on= frequency_date_column, how='inner') \
.merge(goc_bonds_average_yield_over_10_years_df, on= frequency_date_column, how='inner')
return goc_bonds_average_df
#------------------------- Government of Canada Benchmark Bonds Yield -------------------------------------------------------------------
def goc_benchmark_bonds_yield(reporting_year_period, frequency_date_column):
goc_benchmark_bonds_yield_over_2_year_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: 2 year',
'Selected Government of Canada benchmark bond yields: 2 year' , frequency_date_column)
goc_benchmark_bonds_yield_over_3_year_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: 3 year',
'Selected Government of Canada benchmark bond yields: 3 year', frequency_date_column)
goc_benchmark_bonds_yield_over_5_year_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: 5 year',
'Selected Government of Canada benchmark bond yields: 5 year', frequency_date_column)
goc_benchmark_bonds_yield_over_7_year_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: 7 year',
'Selected Government of Canada benchmark bond yields: 7 year', frequency_date_column)
goc_benchmark_bonds_yield_over_10_years_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: 10 years',
'Selected Government of Canada benchmark bond yields: 10 years', frequency_date_column)
goc_benchmark_bonds_yield_over_long_term_df = \
get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'GOC benchmark bond yields: long term',
'Selected Government of Canada benchmark bond yields: long term', frequency_date_column)
goc_benchmark_bonds_yield_df = \
goc_benchmark_bonds_yield_over_2_year_df.merge(goc_benchmark_bonds_yield_over_3_year_df,
on= frequency_date_column, how='inner') \
.merge(goc_benchmark_bonds_yield_over_5_year_df, on= frequency_date_column, how='inner') \
.merge(goc_benchmark_bonds_yield_over_7_year_df, on= frequency_date_column, how='inner') \
.merge(goc_benchmark_bonds_yield_over_10_years_df, on= frequency_date_column, how='inner') \
.merge(goc_benchmark_bonds_yield_over_long_term_df, on= frequency_date_column, how='inner')
return goc_benchmark_bonds_yield_df
#------------------------------------------------------------Government of Canada Treasury Bills --------------------------------------------
def Treasury_bills(reporting_year_period, frequency_date_column):
Treasury_bills_1_month_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'Treasury bills: 1 month',
'Treasury bills: 1 month', frequency_date_column)
Treasury_bills_2_month_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'Treasury bills: 2 month',
'Treasury bills: 2 month', frequency_date_column)
Treasury_bills_3_month_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'Treasury bills: 3 month',
'Treasury bills: 3 month', frequency_date_column)
Treasury_bills_6_month_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'Treasury bills: 6 month',
'Treasury bills: 6 month', frequency_date_column)
Treasury_bills_1_year_df = get_Government_of_Canada_bonds_or_T_bill(reporting_year_period, 'Treasury bills: 1 year',
'Treasury bills: 1 year', frequency_date_column)
Treasury_bills_df = Treasury_bills_1_month_df.merge(Treasury_bills_2_month_df, on= frequency_date_column, how='inner') \
.merge(Treasury_bills_3_month_df, on= frequency_date_column, how='inner') \
.merge(Treasury_bills_6_month_df, on= frequency_date_column, how='inner') \
.merge(Treasury_bills_1_year_df, on=frequency_date_column, how='inner')
return Treasury_bills_df
#----------------------------------------- Other Economic Factors ------------------------------------------------------------------
def other_economic_factors(reporting_year_period, frequency_date_column):
    unemployment_rate_df = get_unemployment_rate(reporting_year_period, frequency_date_column)
    CPI_inflaction_rate_df = get_CPI_inflaction_rate(reporting_year_period, frequency_date_column)
    get_morgage_rate_df = get_morgage_rate(reporting_year_period, frequency_date_column)
    get_prime_rate_df = get_prime_rate(reporting_year_period, frequency_date_column)
    get_house_price_index_df = get_house_price_index(reporting_year_period, frequency_date_column)
    get_Real_GDP_growth_df = get_Real_GDP_growth(reporting_year_period, frequency_date_column)
    market_index_volatility_df = get_market_index_volatility(reporting_year_period, frequency_date_column)
    # Note: trade_balance_rate_df is fetched here but is not merged into the result below
    trade_balance_rate_df = get_trade_balance_rate(reporting_year_period, frequency_date_column)
    other_economic_factors_df = CPI_inflaction_rate_df.merge(get_morgage_rate_df, on=frequency_date_column, how='inner') \
        .merge(get_prime_rate_df, on=frequency_date_column, how='inner') \
        .merge(get_house_price_index_df, on=frequency_date_column, how='inner') \
        .merge(unemployment_rate_df, on=frequency_date_column, how='inner') \
        .merge(get_Real_GDP_growth_df, on=frequency_date_column, how='inner') \
        .merge(market_index_volatility_df, on=frequency_date_column, how='inner')
    return other_economic_factors_df
#----------------------------------------------------------- All the Economic Factors -----------------------------------------
def get_economic_factors_df(reporting_year_period, reporting_frequency):
    # Set the reporting frequency ('Month' or 'Quarter', case-insensitive)
    if reporting_frequency.capitalize() in ('Month', 'Quarter'):
        frequency_date_column = reporting_frequency.capitalize() + '_Year'
        #frequency = reporting_frequency[0].upper()
        goc_bonds_average_df = goc_bonds_average(reporting_year_period, frequency_date_column)
        goc_benchmark_bonds_yield_df = goc_benchmark_bonds_yield(reporting_year_period, frequency_date_column)
        Treasury_bills_df = Treasury_bills(reporting_year_period, frequency_date_column)
        other_economic_factors_df = other_economic_factors(reporting_year_period, frequency_date_column)
        economic_factors_df = goc_bonds_average_df.merge(goc_benchmark_bonds_yield_df, on=frequency_date_column, how='inner') \
            .merge(Treasury_bills_df, on=frequency_date_column, how='inner') \
            .merge(other_economic_factors_df, on=frequency_date_column, how='inner')
        return economic_factors_df
    else:
        return 'The reporting frequency should be alphabetic: Month or Quarter'
#------------------------------------------------------------- Macroeconomic Factors Plotting ---------------------------------------
def annotate_bars(ax): # this function was generated by ChatGPT
    # Write each bar segment's height at the centre of the segment
    for p in ax.patches:
        width, height = p.get_width(), p.get_height()
        x, y = p.get_xy()
        ax.annotate(f'{height:.1f}', (x + width/2, y + height/2), ha='center', va='center', fontsize=10, color='black')
def get_economic_factors_barplotting(goc_bonds_average_df, goc_benchmark_bonds_yield_df, Treasury_bills_df, other_economic_factors_df):
    fig, axes = plt.subplots(4, 1, figsize=(20, 35), constrained_layout=True)
    bar_width = 0.7
    bar0 = goc_bonds_average_df.plot(kind='bar', width=bar_width, stacked=True, ax=axes[0])
    bar0.set_title('Government of Canada Bonds Average', color='black')
    bar0.legend(loc='best')
    annotate_bars(axes[0])
    bar1 = goc_benchmark_bonds_yield_df.plot(kind='bar', width=bar_width, stacked=True, ax=axes[1])
    bar1.set_title('Government of Canada Benchmark Bonds Yield', color='black')
    bar1.legend(loc='best')
    annotate_bars(axes[1])
    bar2 = Treasury_bills_df.plot(kind='bar', width=bar_width, stacked=True, ax=axes[2])
    bar2.set_title('Government of Canada Treasury Bills', color='black')
    bar2.legend(loc='best')
    annotate_bars(axes[2])
    bar3 = other_economic_factors_df.plot(kind='bar', width=bar_width, stacked=True, ax=axes[3])
    bar3.set_title('Government of Canada Other Economic Factors', color='black')
    bar3.legend(loc='best')
    annotate_bars(axes[3])
#---------------------------- Principal Component Analysis (PCA) to select the most important economic factors ---------------------------------
def selecting_importent_economic_factors_treshold_method_PCA(df,threshold):
    # Keep the row labels that have at least one absolute value above the threshold
    return df[(df.abs() > threshold).any(axis=1)].index.to_list()
def setting_PCA_for_economic_factors(economic_factors_df):
    # economic indicators dataset
    # economic_factors_df = get_economic_factors_df(reporting_year_period, reporting_frequency)
    # Standardizing the data (fit_transform returns a NumPy array)
    scaler = StandardScaler()
    scaled_data_df = scaler.fit_transform(economic_factors_df)
    # Applying PCA
    all_pca = PCA(n_components=None) # Keep all components so the number to retain can be chosen afterwards
    all_principal_components = all_pca.fit_transform(scaled_data_df)
    # Explained variance ratio of each component
    explained_variance = all_pca.explained_variance_ratio_
    # Principal component loadings (coefficients)
    loadings_matrix = all_pca.components_
    # Create a DataFrame for the loadings: one row per indicator, one column per component
    loadings_matrix_df = pd.DataFrame(loadings_matrix.T, columns=[f'PC{i+1}' for i in range(loadings_matrix.shape[0])],
                                      index=economic_factors_df.columns)
    return loadings_matrix_df, explained_variance
def get_num_components(explained_variance,cumulative_variance_treshold = 0.9):
    # Count the components whose cumulative explained variance is at or below the threshold,
    # add one, and cap the result at the number of components available
    cumulative_variance = explained_variance.cumsum()
    num_components = (cumulative_variance <= cumulative_variance_treshold).sum() + 1
    return int(min(num_components, len(explained_variance)))
def select_top_components_df(loadings_matrix_df, num_components, threshold_for_high_loadings = 0.5):
    # Select top components (threshold_for_high_loadings is accepted but not used here)
    return loadings_matrix_df.iloc[:, :num_components]
def select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_high_loadings = 0.5):
    # Select top components
    selected_components_df = loadings_matrix_df.iloc[:, :num_components]
    # Find indicators with a high absolute loading on at least one selected component
    return selected_components_df[(selected_components_df.abs() > threshold_for_high_loadings).any(axis=1)]
def plot_explained_variance_(economic_factors_df):
    loadings_matrix_df, explained_variance = setting_PCA_for_economic_factors(economic_factors_df)
    # Display the explained variance ratios
    explained_variance_df = pd.DataFrame(explained_variance).T
    explained_variance_df.columns = loadings_matrix_df.columns
    display(explained_variance_df)
    # Plotting the explained variance
    plt.figure(figsize=(10, 6))
    plt.bar(range(1, len(explained_variance) + 1), explained_variance, alpha=0.5, align='center', label='individual explained variance')
    plt.step(range(1, len(explained_variance) + 1), np.cumsum(explained_variance), where='mid', label='cumulative explained variance')
    plt.xlabel('Principal Components')
    plt.ylabel('Explained Variance Ratio')
    plt.title('Explained Variance by Principal Components')
    plt.legend(loc='best')
    plt.show()
def plotting_corr_matrix(economic_factors_matrix, title):
    g = sns.clustermap(economic_factors_matrix, method='complete', cmap='RdBu', annot=True, annot_kws={'size': 15}, figsize=(20, 15),
                       cbar_kws={"shrink": 0.6, "aspect": 15})
    plt.subplots_adjust(top=0.85)
    plt.setp(g.ax_heatmap.get_xticklabels(), rotation=90)
    plt.setp(g.ax_heatmap.get_yticklabels(), rotation=0)
    g.cax.set_position([1.02, 0.3, 0.03, 0.4]) # [left, bottom, width, height]
    g.cax.set_ylabel('Correlation Coefficient', rotation=270, labelpad=15) # Rotate label
    g.fig.suptitle(title, y=0.9, fontsize=12)
def get_most_important_economic_factors_list(economic_factors_df,
                                             cumulative_variance_treshold = 1, threshold_for_highest_loadings = 0.5):
    #plot_explained_variance_(reporting_year_period, reporting_frequency)
    #plot_explained_variance_(economic_factors_df)
    loadings_matrix_df, explained_variance = setting_PCA_for_economic_factors(economic_factors_df)
    #print('\nloadings_matrix_df\n')
    #display(loadings_matrix_df)
    num_components = get_num_components(explained_variance,cumulative_variance_treshold)
    top_components_df = select_top_components_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    #print('\ntop_components_df\n')
    #display(top_components_df)
    #print('\ntop_indicators_df\n')
    top_indicators_df = select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    #display(top_indicators_df)
    most_important_economic_factors_list = selecting_importent_economic_factors_treshold_method_PCA(top_indicators_df,
                                                                                                    threshold_for_highest_loadings)
    return most_important_economic_factors_list
def plotting_most_important_economic_factors_list(economic_factors_df,
                                                  cumulative_variance_treshold = 1, threshold_for_highest_loadings = 0.5):
    #plot_explained_variance_(reporting_year_period, reporting_frequency)
    plot_explained_variance_(economic_factors_df)
    loadings_matrix_df, explained_variance = setting_PCA_for_economic_factors(economic_factors_df)
    print('\nloadings_matrix_df\n')
    display(loadings_matrix_df)
    num_components = get_num_components(explained_variance,cumulative_variance_treshold)
    top_components_df = select_top_components_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    print('\ntop_components_df\n')
    display(top_components_df)
    print('\ntop_indicators_df\n')
    top_indicators_df = select_top_indicators_df(loadings_matrix_df, num_components, threshold_for_highest_loadings)
    display(top_indicators_df)
    most_important_economic_factors_list = selecting_importent_economic_factors_treshold_method_PCA(top_indicators_df,
                                                                                                    threshold_for_highest_loadings)
def get_most_important_economic_factors_df(economic_factors_df, most_important_economic_factors_list):
    return economic_factors_df[most_important_economic_factors_list]
def get_most_important_economic_factors_matrix(most_important_economic_factors_df):
    return generate_correlation_matrix(most_important_economic_factors_df)
def plotting_most_important_economic_factors_corr_clustermap(most_important_economic_factors_matrix):
    # PCA coupled with the correlation matrix to select the most important factors
    plotting_corr_matrix(most_important_economic_factors_matrix,'Most Important Economic Factors Correlation Matrix Cluster Map using PCA')
#----------------------------------------------------------------------Main Data Setting-------------------------------------------
reporting_year_period = start_date(365*5)
reporting_frequency = 'Quarter'
cumulative_variance_treshold = 1.0
threshold_for_highest_loadings = 0.5
correlation_coefficient_treshold = 0.3
#Economic Factors Data Frames
goc_bonds_average_df = goc_bonds_average(reporting_year_period, reporting_frequency)
goc_benchmark_bonds_yield_df = goc_benchmark_bonds_yield(reporting_year_period, reporting_frequency)
Treasury_bills_df = Treasury_bills(reporting_year_period, reporting_frequency)
other_economic_factors_df = other_economic_factors(reporting_year_period, reporting_frequency)
trade_balance_rate_df = get_trade_balance_rate(reporting_year_period, reporting_frequency)
economic_factors_df = get_economic_factors_df(reporting_year_period, reporting_frequency)
#Correlation matrix of all the economic factors
economic_factors_matrix = generate_correlation_matrix(economic_factors_df)
#Principal Components Analysis(PCA) to select Most Important Economic Factors
most_important_economic_factors_list = get_most_important_economic_factors_list(economic_factors_df, cumulative_variance_treshold,
threshold_for_highest_loadings)
most_important_economic_factors_df = get_most_important_economic_factors_df(economic_factors_df, most_important_economic_factors_list)
most_important_economic_factors_matrix = get_most_important_economic_factors_matrix(most_important_economic_factors_df)
#------------------------------------------------- Data Visualization ------------------------------------------------------------------
def print_economic_factors_data_table():
    print('\n **********************************************************\n'+
          ' All the Economic Factors Data Tables\n'+
          ' *********************************************************\n')
    display(economic_factors_df)
    get_economic_factors_barplotting(goc_bonds_average_df, goc_benchmark_bonds_yield_df, Treasury_bills_df, other_economic_factors_df)
def print_economic_factors_data_corr_matrix():
    print('\n **********************************************************\n'+
          ' All the Economic Factors Correlation Matrix\n'+
          ' *********************************************************\n')
    display(economic_factors_matrix)
    plotting_corr_matrix(economic_factors_matrix, 'All the economic factors')
    #plotting_selected_assets_corr_mat_clustermap(economic_factors_matrix, 'All the economic factors')
def print_most_important_economic_factors():
    print('\n *****************************************************************************************\n'+
          ' Principal Components Analysis(PCA) to select Most Important Economic Factors \n'+
          ' ****************************************************************************************\n')
    print('Principal Components Analysis(PCA) to select Most Important Economic Factors \n')
    plotting_most_important_economic_factors_list(economic_factors_df, cumulative_variance_treshold, threshold_for_highest_loadings)
    print('\n most_important_economic_factors_df\n')
    display(most_important_economic_factors_df)
    print('\n most_important_economic_factors_matrix\n')
    display(most_important_economic_factors_matrix)
    plotting_corr_matrix(most_important_economic_factors_matrix, 'Most Important Economic Factors correlation Matrix - PCA Method')
    #plotting_selected_assets_corr_mat_clustermap(most_important_economic_factors_matrix, 'Most Important Economic Factors correlation Matrix - PCA Method',
    #                                             dendrogram = True)
print_economic_factors_data_table()
**********************************************************
All the Economic Factors Data Tables
*********************************************************
| Quarter_Year | GOC Marketable Bonds Average Yield: 1-3 year | GOC Marketable Bonds Average Yield: 5-10 year | GOC Marketable Bonds Average Yield: 3-5 year | GOC Marketable Bonds Average Yield: over 10 years | GOC benchmark bond yields: 2 year | GOC benchmark bond yields: 3 year | GOC benchmark bond yields: 5 year | GOC benchmark bond yields: 7 year | GOC benchmark bond yields: 10 years | GOC benchmark bond yields: long term | ... | Treasury bills: 2 month | Treasury bills: 3 month | Treasury bills: 6 month | Treasury bills: 1 year | CPI Inflaction Rate | Morgage Rate | Prime Rate | House Price Index(house and land) | Unemployment rate | Real GDP growth Seasonal adjustment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020Q1 | 1.2 | 1.1 | 1.1 | 1.3 | 1.1 | 1.1 | 1.1 | 1.1 | 1.1 | 1.4 | ... | 1.3 | 1.2 | 1.2 | 1.2 | 2.0 | 4.0 | 1.8 | 0.2 | 4.6 | -2.1 |
| 2020Q2 | 0.3 | 0.5 | 0.4 | 1.0 | 0.3 | 0.3 | 0.4 | 0.4 | 0.5 | 1.1 | ... | 0.2 | 0.2 | 0.3 | 0.3 | 1.6 | 3.9 | 1.2 | 0.1 | 7.8 | -10.6 |
| 2020Q3 | 0.2 | 0.5 | 0.3 | 1.0 | 0.3 | 0.3 | 0.4 | 0.4 | 0.6 | 1.1 | ... | 0.2 | 0.1 | 0.2 | 0.2 | 1.4 | 3.6 | 1.1 | 0.7 | 5.9 | 8.9 |
| 2020Q4 | 0.2 | 0.6 | 0.4 | 1.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.7 | 1.2 | ... | 0.1 | 0.1 | 0.1 | 0.2 | 1.7 | 3.4 | 1.0 | 0.5 | 5.2 | 2.1 |
| 2021Q1 | 0.2 | 1.0 | 0.5 | 1.7 | 0.2 | 0.3 | 0.7 | 0.9 | 1.2 | 1.8 | ... | 0.1 | 0.1 | 0.1 | 0.1 | 1.7 | 3.3 | 1.1 | 1.2 | 5.5 | 1.2 |
| 2021Q2 | 0.3 | 1.3 | 0.8 | 1.9 | 0.4 | 0.5 | 0.9 | 1.2 | 1.5 | 2.0 | ... | 0.1 | 0.1 | 0.2 | 0.2 | 2.4 | 3.3 | 1.2 | 1.3 | 5.1 | -0.1 |
| 2021Q3 | 0.4 | 1.2 | 0.8 | 1.8 | 0.5 | 0.6 | 0.9 | 1.1 | 1.3 | 1.8 | ... | 0.2 | 0.2 | 0.2 | 0.3 | 2.9 | 3.2 | 1.2 | 0.5 | 4.6 | 1.6 |
| 2021Q4 | 1.0 | 1.5 | 1.3 | 1.9 | 1.0 | 1.1 | 1.4 | 1.5 | 1.6 | 1.9 | ... | 0.1 | 0.1 | 0.3 | 0.7 | 3.1 | 3.4 | 1.3 | 0.6 | 4.1 | 1.6 |
| 2022Q1 | 1.6 | 2.0 | 1.9 | 2.3 | 1.7 | 1.8 | 2.0 | 2.0 | 2.1 | 2.2 | ... | 0.3 | 0.4 | 0.9 | 1.4 | 4.0 | 3.6 | 1.6 | 1.1 | 4.3 | 0.8 |
| 2022Q2 | 2.7 | 2.9 | 2.8 | 3.0 | 2.7 | 2.8 | 2.8 | 2.8 | 2.9 | 2.9 | ... | 1.4 | 1.6 | 2.1 | 2.6 | 5.3 | 4.6 | 2.5 | 0.3 | 3.7 | 1.1 |
| 2022Q3 | 3.5 | 3.0 | 3.2 | 3.0 | 3.5 | 3.4 | 3.1 | 3.0 | 3.0 | 2.9 | ... | 2.9 | 3.1 | 3.4 | 3.7 | 5.8 | 5.6 | 3.2 | 0.0 | 3.6 | 0.5 |
| 2022Q4 | 3.9 | 3.2 | 3.5 | 3.3 | 3.9 | 3.7 | 3.3 | 3.1 | 3.2 | 3.2 | ... | 3.9 | 4.1 | 4.2 | 4.4 | 6.0 | 5.8 | 3.7 | -0.1 | 3.5 | -0.0 |
| 2023Q1 | 3.9 | 3.0 | 3.3 | 3.1 | 3.8 | 3.6 | 3.2 | 3.0 | 3.0 | 3.1 | ... | 4.4 | 4.4 | 4.4 | 4.4 | 5.9 | 5.8 | 3.9 | -0.2 | 3.8 | 0.6 |
| 2023Q2 | 4.1 | 3.1 | 3.4 | 3.1 | 4.1 | 3.8 | 3.3 | 3.1 | 3.1 | 3.1 | ... | 4.6 | 4.6 | 4.7 | 4.7 | 5.3 | 5.8 | 4.0 | 0.0 | 3.7 | 0.2 |
| 2023Q3 | 4.8 | 3.8 | 4.1 | 3.6 | 4.8 | 4.5 | 4.0 | 3.8 | 3.7 | 3.5 | ... | 5.0 | 5.0 | 5.1 | 5.2 | 4.6 | 6.1 | 4.3 | -0.1 | 3.7 | -0.1 |
| 2023Q4 | 4.3 | 3.6 | 3.7 | 3.4 | 4.3 | 4.1 | 3.7 | 3.6 | 3.6 | 3.4 | ... | 5.0 | 5.0 | 5.0 | 4.8 | 4.0 | 6.4 | 4.3 | -0.1 | 3.6 | 0.1 |
| 2024Q1 | 4.2 | 3.4 | 3.6 | 3.4 | 4.1 | 3.9 | 3.5 | 3.4 | 3.4 | 3.3 | ... | 5.0 | 5.0 | 4.9 | 4.8 | 3.1 | 6.2 | 4.1 | 0.0 | 4.0 | 0.5 |
| 2024Q2 | 4.3 | 3.7 | 3.8 | 3.6 | 4.2 | 4.1 | 3.7 | 3.7 | 3.7 | 3.6 | ... | 4.8 | 4.8 | 4.8 | 4.6 | 2.5 | 6.1 | 4.1 | 0.2 | 4.0 | 0.4 |
18 rows × 21 columns
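The table above is assembled by chaining inner merges on the shared period column (`Quarter_Year` in this run). As a minimal sketch, and not the notebook's own helper, the same pattern can be folded over a list of frames with `functools.reduce`. The function name `merge_on_period`, the shortened column names, and the toy frames are assumptions made for illustration only.

```python
from functools import reduce

import pandas as pd


def merge_on_period(frames, period_column, how="inner"):
    """Merge a list of DataFrames pairwise on one shared period column."""
    return reduce(
        lambda left, right: left.merge(right, on=period_column, how=how),
        frames,
    )


# Toy frames standing in for the per-series downloads (illustrative values).
two_year = pd.DataFrame({"Quarter_Year": ["2020Q1", "2020Q2"], "2 year": [1.1, 0.3]})
five_year = pd.DataFrame({"Quarter_Year": ["2020Q1", "2020Q2"], "5 year": [1.1, 0.4]})
ten_year = pd.DataFrame({"Quarter_Year": ["2020Q2", "2020Q3"], "10 years": [0.5, 0.6]})

merged = merge_on_period([two_year, five_year, ten_year], "Quarter_Year")
print(merged)
```

Because every merge is an inner join, a period survives only if it appears in all of the input frames; in the toy data only `2020Q2` does. A single series with a shorter history therefore shortens the whole merged table.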
print_economic_factors_data_corr_matrix()
**********************************************************
All the Economic Factors Correlation Matrix
*********************************************************
| | GOC Marketable Bonds Average Yield: 1-3 year | GOC Marketable Bonds Average Yield: 5-10 year | GOC Marketable Bonds Average Yield: 3-5 year | GOC Marketable Bonds Average Yield: over 10 years | GOC benchmark bond yields: 2 year | GOC benchmark bond yields: 3 year | GOC benchmark bond yields: 5 year | GOC benchmark bond yields: 7 year | GOC benchmark bond yields: 10 years | GOC benchmark bond yields: long term | ... | Treasury bills: 2 month | Treasury bills: 3 month | Treasury bills: 6 month | Treasury bills: 1 year | CPI Inflaction Rate | Morgage Rate | Prime Rate | House Price Index(house and land) | Unemployment rate | Real GDP growth Seasonal adjustment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOC Marketable Bonds Average Yield: 1-3 year | 1.000000 | 0.976199 | 0.993209 | 0.951231 | 0.999462 | 0.998315 | 0.989338 | 0.977915 | 0.967320 | 0.952575 | ... | 0.966132 | 0.973636 | 0.987148 | 0.996983 | 0.725500 | 0.972380 | 0.989560 | -0.744187 | -0.746375 | -0.031706 |
| GOC Marketable Bonds Average Yield: 5-10 year | 0.976199 | 1.000000 | 0.992777 | 0.992448 | 0.978538 | 0.985369 | 0.995928 | 0.999459 | 0.998761 | 0.991043 | ... | 0.913679 | 0.925168 | 0.945192 | 0.962425 | 0.739110 | 0.920224 | 0.952833 | -0.625150 | -0.802838 | 0.009622 |
| GOC Marketable Bonds Average Yield: 3-5 year | 0.993209 | 0.992777 | 1.000000 | 0.975615 | 0.994331 | 0.997760 | 0.998818 | 0.993509 | 0.987007 | 0.973534 | ... | 0.936388 | 0.947076 | 0.966219 | 0.983328 | 0.756007 | 0.946726 | 0.971334 | -0.693431 | -0.785146 | -0.017584 |
| GOC Marketable Bonds Average Yield: over 10 years | 0.951231 | 0.992448 | 0.975615 | 1.000000 | 0.954739 | 0.963174 | 0.980846 | 0.989952 | 0.995923 | 0.998236 | ... | 0.885694 | 0.898744 | 0.918627 | 0.937023 | 0.746162 | 0.886569 | 0.927391 | -0.561822 | -0.805263 | 0.029892 |
| GOC benchmark bond yields: 2 year | 0.999462 | 0.978538 | 0.994331 | 0.954739 | 1.000000 | 0.998947 | 0.990991 | 0.979915 | 0.970160 | 0.955397 | ... | 0.962590 | 0.970533 | 0.985163 | 0.996038 | 0.734083 | 0.969309 | 0.987723 | -0.735073 | -0.750504 | -0.021169 |
| GOC benchmark bond yields: 3 year | 0.998315 | 0.985369 | 0.997760 | 0.963174 | 0.998947 | 1.000000 | 0.995408 | 0.986585 | 0.977878 | 0.962796 | ... | 0.953059 | 0.962051 | 0.978696 | 0.992004 | 0.740256 | 0.962145 | 0.982265 | -0.720636 | -0.765741 | -0.014608 |
| GOC benchmark bond yields: 5 year | 0.989338 | 0.995928 | 0.998818 | 0.980846 | 0.990991 | 0.995408 | 1.000000 | 0.996909 | 0.991605 | 0.978755 | ... | 0.929439 | 0.940004 | 0.960019 | 0.977644 | 0.751887 | 0.937894 | 0.966160 | -0.668349 | -0.793680 | -0.001892 |
| GOC benchmark bond yields: 7 year | 0.977915 | 0.999459 | 0.993509 | 0.989952 | 0.979915 | 0.986585 | 0.996909 | 1.000000 | 0.997936 | 0.988589 | ... | 0.915759 | 0.926639 | 0.946566 | 0.963724 | 0.735470 | 0.920864 | 0.954083 | -0.626895 | -0.806113 | 0.007730 |
| GOC benchmark bond yields: 10 years | 0.967320 | 0.998761 | 0.987007 | 0.995923 | 0.970160 | 0.977878 | 0.991605 | 0.997936 | 1.000000 | 0.994981 | ... | 0.904050 | 0.915834 | 0.935894 | 0.953085 | 0.736259 | 0.907991 | 0.943930 | -0.591373 | -0.811841 | 0.030765 |
| GOC benchmark bond yields: long term | 0.952575 | 0.991043 | 0.973534 | 0.998236 | 0.955397 | 0.962796 | 0.978755 | 0.988589 | 0.994981 | 1.000000 | ... | 0.898326 | 0.909881 | 0.926861 | 0.940557 | 0.726525 | 0.894927 | 0.935540 | -0.562839 | -0.794732 | 0.025609 |
| Treasury bills: 1 month | 0.958881 | 0.903818 | 0.927301 | 0.876058 | 0.955022 | 0.944663 | 0.919827 | 0.906120 | 0.894311 | 0.889460 | ... | 0.999360 | 0.997743 | 0.990470 | 0.974308 | 0.569049 | 0.978951 | 0.987231 | -0.765780 | -0.633458 | -0.047291 |
| Treasury bills: 2 month | 0.966132 | 0.913679 | 0.936388 | 0.885694 | 0.962590 | 0.953059 | 0.929439 | 0.915759 | 0.904050 | 0.898326 | ... | 1.000000 | 0.999162 | 0.994025 | 0.980071 | 0.587856 | 0.982410 | 0.991599 | -0.768520 | -0.644867 | -0.045324 |
| Treasury bills: 3 month | 0.973636 | 0.925168 | 0.947076 | 0.898744 | 0.970533 | 0.962051 | 0.940004 | 0.926639 | 0.915834 | 0.909881 | ... | 0.999162 | 1.000000 | 0.997109 | 0.986143 | 0.612922 | 0.986454 | 0.995128 | -0.770248 | -0.658274 | -0.048666 |
| Treasury bills: 6 month | 0.987148 | 0.945192 | 0.966219 | 0.918627 | 0.985163 | 0.978696 | 0.960019 | 0.946566 | 0.935894 | 0.926861 | ... | 0.994025 | 0.997109 | 1.000000 | 0.995380 | 0.654033 | 0.989194 | 0.999187 | -0.764929 | -0.682239 | -0.045914 |
| Treasury bills: 1 year | 0.996983 | 0.962425 | 0.983328 | 0.937023 | 0.996038 | 0.992004 | 0.977644 | 0.963724 | 0.953085 | 0.940557 | ... | 0.980071 | 0.986143 | 0.995380 | 1.000000 | 0.709570 | 0.982448 | 0.995893 | -0.761639 | -0.720879 | -0.036478 |
| CPI Inflaction Rate | 0.725500 | 0.739110 | 0.756007 | 0.746162 | 0.734083 | 0.740256 | 0.751887 | 0.735470 | 0.736259 | 0.726525 | ... | 0.587856 | 0.612922 | 0.654033 | 0.709570 | 1.000000 | 0.629513 | 0.666817 | -0.541372 | -0.751794 | 0.017866 |
| Morgage Rate | 0.972380 | 0.920224 | 0.946726 | 0.886569 | 0.969309 | 0.962145 | 0.937894 | 0.920864 | 0.907991 | 0.894927 | ... | 0.982410 | 0.986454 | 0.989194 | 0.982448 | 0.629513 | 1.000000 | 0.987620 | -0.805028 | -0.617461 | -0.090534 |
| Prime Rate | 0.989560 | 0.952833 | 0.971334 | 0.927391 | 0.987723 | 0.982265 | 0.966160 | 0.954083 | 0.943930 | 0.935540 | ... | 0.991599 | 0.995128 | 0.999187 | 0.995893 | 0.666817 | 0.987620 | 1.000000 | -0.762774 | -0.694479 | -0.048387 |
| House Price Index(house and land) | -0.744187 | -0.625150 | -0.693431 | -0.561822 | -0.735073 | -0.720636 | -0.668349 | -0.626895 | -0.591373 | -0.562839 | ... | -0.768520 | -0.770248 | -0.764929 | -0.761639 | -0.541372 | -0.805028 | -0.762774 | 1.000000 | 0.388159 | 0.275270 |
| Unemployment rate | -0.746375 | -0.802838 | -0.785146 | -0.805263 | -0.750504 | -0.765741 | -0.793680 | -0.806113 | -0.811841 | -0.794732 | ... | -0.644867 | -0.658274 | -0.682239 | -0.720879 | -0.751794 | -0.617461 | -0.694479 | 0.388159 | 1.000000 | -0.353492 |
| Real GDP growth Seasonal adjustment | -0.031706 | 0.009622 | -0.017584 | 0.029892 | -0.021169 | -0.014608 | -0.001892 | 0.007730 | 0.030765 | 0.025609 | ... | -0.045324 | -0.048666 | -0.045914 | -0.036478 | 0.017866 | -0.090534 | -0.048387 | 0.275270 | -0.353492 | 1.000000 |
21 rows × 21 columns
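In a matrix of this size it can help to list only the pairs whose absolute correlation exceeds a cutoff. The sketch below is illustrative rather than part of the notebook: `strongly_correlated_pairs` is a hypothetical helper, the 0.7 cutoff is an arbitrary choice, and the small matrix reuses four rows and columns of the output above.

```python
import numpy as np
import pandas as pd


def strongly_correlated_pairs(corr, cutoff):
    """Return (row, column, r) for each pair with |r| >= cutoff."""
    # Keep the strict upper triangle: no diagonal, no mirrored duplicates.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # NaN entries (masked cells) never satisfy the comparison, so they drop out here.
    pairs = pairs[pairs.abs() >= cutoff]
    return [(row, col, float(r)) for (row, col), r in pairs.items()]


labels = [
    "GOC benchmark bond yields: 2 year",
    "Prime Rate",
    "Unemployment rate",
    "Real GDP growth Seasonal adjustment",
]
values = np.array([
    [1.000000, 0.987723, -0.750504, -0.021169],
    [0.987723, 1.000000, -0.694479, -0.048387],
    [-0.750504, -0.694479, 1.000000, -0.353492],
    [-0.021169, -0.048387, -0.353492, 1.000000],
])
corr = pd.DataFrame(values, index=labels, columns=labels)

pairs = strongly_correlated_pairs(corr, cutoff=0.7)
for row, col, r in pairs:
    print(f"{row} / {col}: {r:+.3f}")
```

On this subset the 2-year benchmark yield pairs strongly with the prime rate (positive) and with the unemployment rate (negative), while real GDP growth stays below the cutoff against every other series.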
print_most_important_economic_factors()
*****************************************************************************************
Principal Components Analysis(PCA) to select Most Important Economic Factors
****************************************************************************************
Principal Components Analysis(PCA) to select Most Important Economic Factors
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15 | PC16 | PC17 | PC18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.85935 | 0.068931 | 0.03383 | 0.024056 | 0.007083 | 0.004152 | 0.001816 | 0.000543 | 0.000109 | 0.000046 | 0.000031 | 0.000021 | 0.000012 | 0.000007 | 0.000005 | 0.000004 | 0.000002 | 4.703571e-34 |
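The row above gives each component's share of the total variance; the number of components to retain follows from its running sum. As a hedged sketch, the snippet below copies the first eight ratios from that row and finds the smallest number of leading components that reaches a given cumulative threshold. The helper `components_needed` is hypothetical and uses `np.searchsorted`, which is a different formulation from the notebook's `get_num_components`.

```python
import numpy as np

# Explained-variance ratios copied from the output row above (first eight components).
explained_variance = np.array(
    [0.85935, 0.068931, 0.03383, 0.024056, 0.007083, 0.004152, 0.001816, 0.000543]
)
cumulative = np.cumsum(explained_variance)


def components_needed(cumulative_variance, threshold):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    # side="left" gives the first position where cumulative_variance >= threshold.
    position = int(np.searchsorted(cumulative_variance, threshold, side="left"))
    # Cap at the number of components available.
    return min(position + 1, len(cumulative_variance))


for threshold in (0.90, 0.95, 0.99):
    print(threshold, components_needed(cumulative, threshold))
```

PC1 alone carries about 86% of the variance, so two components already pass a 0.90 threshold. With the threshold of 1.0 used in this run, 17 components are retained (see the `top_components_df` output), so very little dimensionality reduction takes place at that setting.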
loadings_matrix_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15 | PC16 | PC17 | PC18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOC Marketable Bonds Average Yield: 1-3 year | 0.234760 | 0.026131 | 0.016201 | -0.012893 | -0.024366 | 0.061716 | 0.291109 | -0.139295 | 0.133451 | 0.157802 | 0.515279 | 0.182155 | 0.241015 | -0.101838 | 0.360692 | -0.184581 | -0.003957 | 0.170916 |
| GOC Marketable Bonds Average Yield: 5-10 year | 0.231555 | -0.080234 | -0.095232 | 0.153528 | 0.031926 | 0.220653 | -0.045993 | 0.043579 | -0.190157 | 0.094682 | -0.343397 | -0.213668 | -0.071449 | 0.159755 | 0.474708 | -0.266721 | 0.230909 | -0.350206 |
| GOC Marketable Bonds Average Yield: 3-5 year | 0.234132 | -0.029851 | -0.072510 | 0.042401 | -0.000326 | 0.170944 | 0.217199 | -0.078918 | 0.286689 | -0.068549 | 0.178631 | -0.298389 | -0.702733 | 0.011258 | 0.143179 | 0.156651 | 0.048250 | 0.146997 |
| GOC Marketable Bonds Average Yield: over 10 years | 0.227174 | -0.122381 | -0.136092 | 0.217986 | -0.057406 | 0.134132 | -0.429816 | -0.114250 | 0.444101 | -0.171831 | -0.208001 | 0.320040 | -0.023063 | 0.076492 | 0.068157 | -0.254903 | -0.325177 | 0.221735 |
| GOC benchmark bond yields: 2 year | 0.234779 | 0.013805 | 0.008394 | -0.010948 | -0.061867 | 0.068719 | 0.297248 | -0.173324 | 0.017306 | -0.047074 | -0.160140 | -0.375752 | 0.524676 | 0.397393 | 0.059814 | 0.058629 | -0.202109 | 0.198754 |
| GOC benchmark bond yields: 3 year | 0.234758 | -0.005337 | -0.016723 | 0.008134 | -0.029301 | 0.139984 | 0.287062 | -0.090465 | 0.039591 | 0.137423 | -0.004180 | -0.218928 | -0.028056 | -0.277546 | -0.505661 | -0.334065 | 0.068860 | -0.036051 |
| GOC benchmark bond yields: 5 year | 0.233485 | -0.051348 | -0.076537 | 0.077274 | 0.002426 | 0.185022 | 0.216081 | -0.095769 | -0.199210 | -0.309908 | 0.116107 | 0.498475 | -0.034998 | 0.289604 | -0.361245 | -0.134422 | 0.216742 | -0.104021 |
| GOC benchmark bond yields: 7 year | 0.231707 | -0.078183 | -0.090904 | 0.152294 | 0.065766 | 0.199976 | 0.051898 | 0.010020 | -0.347636 | -0.330405 | -0.001623 | 0.058658 | 0.114303 | -0.590334 | 0.179727 | 0.203702 | -0.278951 | -0.212126 |
| GOC benchmark bond yields: 10 years | 0.230046 | -0.107253 | -0.098440 | 0.189327 | 0.017254 | 0.166038 | -0.119412 | 0.138684 | -0.133025 | 0.001400 | -0.256611 | -0.100427 | 0.049083 | 0.024542 | -0.254798 | 0.375737 | 0.279357 | 0.379159 |
| GOC benchmark bond yields: long term | 0.227639 | -0.108980 | -0.102382 | 0.240735 | -0.052869 | 0.050951 | -0.511097 | -0.023798 | -0.019783 | 0.353407 | 0.502917 | -0.195265 | 0.169617 | 0.019484 | -0.164477 | 0.134039 | 0.065868 | -0.173841 |
| Treasury bills: 1 month | 0.225278 | 0.128380 | 0.259529 | 0.019614 | 0.025262 | -0.352326 | -0.132600 | -0.161266 | 0.086808 | -0.541618 | 0.152998 | -0.267793 | -0.031216 | 0.162620 | -0.103672 | 0.115781 | -0.119419 | -0.296309 |
| Treasury bills: 2 month | 0.226989 | 0.119869 | 0.237662 | 0.011307 | 0.007964 | -0.309849 | -0.107582 | -0.145893 | -0.201502 | -0.078164 | -0.020385 | 0.101861 | -0.005137 | -0.203259 | 0.208998 | 0.083060 | 0.322462 | 0.497798 |
| Treasury bills: 3 month | 0.228918 | 0.111167 | 0.202732 | 0.004675 | -0.015006 | -0.274253 | -0.119133 | -0.032772 | 0.104932 | 0.042636 | -0.161700 | -0.009663 | 0.049030 | -0.144779 | -0.006277 | -0.375099 | 0.378795 | -0.164712 |
| Treasury bills: 6 month | 0.231878 | 0.088394 | 0.144440 | -0.000347 | -0.064215 | -0.175812 | 0.048952 | -0.000216 | -0.009291 | 0.301415 | -0.265119 | -0.029288 | -0.099210 | -0.238959 | -0.209504 | -0.082658 | -0.459009 | 0.058260 |
| Treasury bills: 1 year | 0.234082 | 0.053455 | 0.065061 | -0.036268 | -0.075239 | -0.075118 | 0.192900 | -0.096072 | 0.369251 | 0.282200 | -0.215109 | 0.348370 | 0.066918 | 0.001790 | 0.046062 | 0.547883 | 0.120218 | -0.373485 |
| CPI Inflaction Rate | 0.173762 | -0.180793 | -0.575582 | -0.504796 | -0.493601 | -0.289764 | -0.058610 | 0.010737 | -0.074097 | -0.086223 | -0.010197 | -0.031054 | -0.013493 | -0.045574 | 0.023215 | -0.007326 | 0.039455 | -0.009281 |
| Morgage Rate | 0.227671 | 0.150522 | 0.160041 | -0.041466 | -0.149447 | 0.043348 | 0.053773 | 0.891069 | 0.145476 | -0.116283 | 0.110641 | 0.020574 | 0.072023 | 0.044917 | 0.027882 | -0.039727 | -0.025172 | 0.026488 |
| Prime Rate | 0.232769 | 0.079864 | 0.118729 | 0.003283 | -0.045882 | -0.147697 | -0.017152 | 0.019519 | -0.508541 | 0.292196 | 0.064961 | 0.185499 | -0.310753 | 0.371407 | 0.050503 | -0.003452 | -0.308323 | 0.006737 |
| House Price Index(house and land) | -0.171404 | -0.381353 | -0.085620 | 0.663168 | -0.268190 | -0.450428 | 0.298552 | 0.103229 | 0.016639 | -0.009538 | 0.011201 | -0.002338 | -0.013328 | 0.005792 | 0.027148 | -0.033986 | 0.003009 | -0.002548 |
| Unemployment rate | -0.181182 | 0.436201 | 0.172573 | 0.184132 | -0.757262 | 0.304913 | -0.061990 | -0.167199 | -0.067842 | -0.048952 | -0.015296 | -0.015750 | -0.028583 | -0.042345 | 0.013702 | 0.025426 | 0.041119 | -0.014845 |
| Real GDP growth Seasonal adjustment | -0.003512 | -0.702397 | 0.578009 | -0.267773 | -0.242523 | 0.190165 | -0.055179 | -0.037586 | -0.019916 | -0.010577 | 0.004074 | -0.000240 | -0.015595 | -0.011826 | 0.009457 | 0.008515 | 0.003751 | -0.009416 |
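The selection rule applied to this loadings matrix keeps an indicator when its absolute loading exceeds the threshold on at least one retained component. The sketch below reproduces that rule on four rows copied from the table above. Restricting it to the first three components is an assumption made for brevity; the run above retains 17.

```python
import pandas as pd

# Four rows and the first three components, copied from the loadings table above.
loadings = pd.DataFrame(
    {
        "PC1": [0.232769, 0.173762, -0.181182, -0.003512],
        "PC2": [0.079864, -0.180793, 0.436201, -0.702397],
        "PC3": [0.118729, -0.575582, 0.172573, 0.578009],
    },
    index=[
        "Prime Rate",
        "CPI Inflaction Rate",
        "Unemployment rate",
        "Real GDP growth Seasonal adjustment",
    ],
)

threshold = 0.5
# True for an indicator if any of its loadings exceeds the threshold in absolute value.
has_high_loading = (loadings.abs() > threshold).any(axis=1)
selected = loadings.index[has_high_loading].to_list()
print(selected)
```

Under these assumptions the CPI inflation rate and real GDP growth are kept, because each has a loading above 0.5 in absolute value on PC2 or PC3. The prime rate is dropped even though it loads on PC1 about as much as every other interest-rate series: PC1 spreads its weight almost evenly across them, so no single loading reaches the threshold.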
top_components_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15 | PC16 | PC17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOC Marketable Bonds Average Yield: 1-3 year | 0.234760 | 0.026131 | 0.016201 | -0.012893 | -0.024366 | 0.061716 | 0.291109 | -0.139295 | 0.133451 | 0.157802 | 0.515279 | 0.182155 | 0.241015 | -0.101838 | 0.360692 | -0.184581 | -0.003957 |
| GOC Marketable Bonds Average Yield: 5-10 year | 0.231555 | -0.080234 | -0.095232 | 0.153528 | 0.031926 | 0.220653 | -0.045993 | 0.043579 | -0.190157 | 0.094682 | -0.343397 | -0.213668 | -0.071449 | 0.159755 | 0.474708 | -0.266721 | 0.230909 |
| GOC Marketable Bonds Average Yield: 3-5 year | 0.234132 | -0.029851 | -0.072510 | 0.042401 | -0.000326 | 0.170944 | 0.217199 | -0.078918 | 0.286689 | -0.068549 | 0.178631 | -0.298389 | -0.702733 | 0.011258 | 0.143179 | 0.156651 | 0.048250 |
| GOC Marketable Bonds Average Yield: over 10 years | 0.227174 | -0.122381 | -0.136092 | 0.217986 | -0.057406 | 0.134132 | -0.429816 | -0.114250 | 0.444101 | -0.171831 | -0.208001 | 0.320040 | -0.023063 | 0.076492 | 0.068157 | -0.254903 | -0.325177 |
| GOC benchmark bond yields: 2 year | 0.234779 | 0.013805 | 0.008394 | -0.010948 | -0.061867 | 0.068719 | 0.297248 | -0.173324 | 0.017306 | -0.047074 | -0.160140 | -0.375752 | 0.524676 | 0.397393 | 0.059814 | 0.058629 | -0.202109 |
| GOC benchmark bond yields: 3 year | 0.234758 | -0.005337 | -0.016723 | 0.008134 | -0.029301 | 0.139984 | 0.287062 | -0.090465 | 0.039591 | 0.137423 | -0.004180 | -0.218928 | -0.028056 | -0.277546 | -0.505661 | -0.334065 | 0.068860 |
| GOC benchmark bond yields: 5 year | 0.233485 | -0.051348 | -0.076537 | 0.077274 | 0.002426 | 0.185022 | 0.216081 | -0.095769 | -0.199210 | -0.309908 | 0.116107 | 0.498475 | -0.034998 | 0.289604 | -0.361245 | -0.134422 | 0.216742 |
| GOC benchmark bond yields: 7 year | 0.231707 | -0.078183 | -0.090904 | 0.152294 | 0.065766 | 0.199976 | 0.051898 | 0.010020 | -0.347636 | -0.330405 | -0.001623 | 0.058658 | 0.114303 | -0.590334 | 0.179727 | 0.203702 | -0.278951 |
| GOC benchmark bond yields: 10 years | 0.230046 | -0.107253 | -0.098440 | 0.189327 | 0.017254 | 0.166038 | -0.119412 | 0.138684 | -0.133025 | 0.001400 | -0.256611 | -0.100427 | 0.049083 | 0.024542 | -0.254798 | 0.375737 | 0.279357 |
| GOC benchmark bond yields: long term | 0.227639 | -0.108980 | -0.102382 | 0.240735 | -0.052869 | 0.050951 | -0.511097 | -0.023798 | -0.019783 | 0.353407 | 0.502917 | -0.195265 | 0.169617 | 0.019484 | -0.164477 | 0.134039 | 0.065868 |
| Treasury bills: 1 month | 0.225278 | 0.128380 | 0.259529 | 0.019614 | 0.025262 | -0.352326 | -0.132600 | -0.161266 | 0.086808 | -0.541618 | 0.152998 | -0.267793 | -0.031216 | 0.162620 | -0.103672 | 0.115781 | -0.119419 |
| Treasury bills: 2 month | 0.226989 | 0.119869 | 0.237662 | 0.011307 | 0.007964 | -0.309849 | -0.107582 | -0.145893 | -0.201502 | -0.078164 | -0.020385 | 0.101861 | -0.005137 | -0.203259 | 0.208998 | 0.083060 | 0.322462 |
| Treasury bills: 3 month | 0.228918 | 0.111167 | 0.202732 | 0.004675 | -0.015006 | -0.274253 | -0.119133 | -0.032772 | 0.104932 | 0.042636 | -0.161700 | -0.009663 | 0.049030 | -0.144779 | -0.006277 | -0.375099 | 0.378795 |
| Treasury bills: 6 month | 0.231878 | 0.088394 | 0.144440 | -0.000347 | -0.064215 | -0.175812 | 0.048952 | -0.000216 | -0.009291 | 0.301415 | -0.265119 | -0.029288 | -0.099210 | -0.238959 | -0.209504 | -0.082658 | -0.459009 |
| Treasury bills: 1 year | 0.234082 | 0.053455 | 0.065061 | -0.036268 | -0.075239 | -0.075118 | 0.192900 | -0.096072 | 0.369251 | 0.282200 | -0.215109 | 0.348370 | 0.066918 | 0.001790 | 0.046062 | 0.547883 | 0.120218 |
| CPI Inflaction Rate | 0.173762 | -0.180793 | -0.575582 | -0.504796 | -0.493601 | -0.289764 | -0.058610 | 0.010737 | -0.074097 | -0.086223 | -0.010197 | -0.031054 | -0.013493 | -0.045574 | 0.023215 | -0.007326 | 0.039455 |
| Morgage Rate | 0.227671 | 0.150522 | 0.160041 | -0.041466 | -0.149447 | 0.043348 | 0.053773 | 0.891069 | 0.145476 | -0.116283 | 0.110641 | 0.020574 | 0.072023 | 0.044917 | 0.027882 | -0.039727 | -0.025172 |
| Prime Rate | 0.232769 | 0.079864 | 0.118729 | 0.003283 | -0.045882 | -0.147697 | -0.017152 | 0.019519 | -0.508541 | 0.292196 | 0.064961 | 0.185499 | -0.310753 | 0.371407 | 0.050503 | -0.003452 | -0.308323 |
| House Price Index(house and land) | -0.171404 | -0.381353 | -0.085620 | 0.663168 | -0.268190 | -0.450428 | 0.298552 | 0.103229 | 0.016639 | -0.009538 | 0.011201 | -0.002338 | -0.013328 | 0.005792 | 0.027148 | -0.033986 | 0.003009 |
| Unemployment rate | -0.181182 | 0.436201 | 0.172573 | 0.184132 | -0.757262 | 0.304913 | -0.061990 | -0.167199 | -0.067842 | -0.048952 | -0.015296 | -0.015750 | -0.028583 | -0.042345 | 0.013702 | 0.025426 | 0.041119 |
| Real GDP growth Seasonal adjustment | -0.003512 | -0.702397 | 0.578009 | -0.267773 | -0.242523 | 0.190165 | -0.055179 | -0.037586 | -0.019916 | -0.010577 | 0.004074 | -0.000240 | -0.015595 | -0.011826 | 0.009457 | 0.008515 | 0.003751 |
top_indicators_df
| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15 | PC16 | PC17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOC Marketable Bonds Average Yield: 1-3 year | 0.234760 | 0.026131 | 0.016201 | -0.012893 | -0.024366 | 0.061716 | 0.291109 | -0.139295 | 0.133451 | 0.157802 | 0.515279 | 0.182155 | 0.241015 | -0.101838 | 0.360692 | -0.184581 | -0.003957 |
| GOC Marketable Bonds Average Yield: 3-5 year | 0.234132 | -0.029851 | -0.072510 | 0.042401 | -0.000326 | 0.170944 | 0.217199 | -0.078918 | 0.286689 | -0.068549 | 0.178631 | -0.298389 | -0.702733 | 0.011258 | 0.143179 | 0.156651 | 0.048250 |
| GOC benchmark bond yields: 2 year | 0.234779 | 0.013805 | 0.008394 | -0.010948 | -0.061867 | 0.068719 | 0.297248 | -0.173324 | 0.017306 | -0.047074 | -0.160140 | -0.375752 | 0.524676 | 0.397393 | 0.059814 | 0.058629 | -0.202109 |
| GOC benchmark bond yields: 3 year | 0.234758 | -0.005337 | -0.016723 | 0.008134 | -0.029301 | 0.139984 | 0.287062 | -0.090465 | 0.039591 | 0.137423 | -0.004180 | -0.218928 | -0.028056 | -0.277546 | -0.505661 | -0.334065 | 0.068860 |
| GOC benchmark bond yields: 7 year | 0.231707 | -0.078183 | -0.090904 | 0.152294 | 0.065766 | 0.199976 | 0.051898 | 0.010020 | -0.347636 | -0.330405 | -0.001623 | 0.058658 | 0.114303 | -0.590334 | 0.179727 | 0.203702 | -0.278951 |
| GOC benchmark bond yields: long term | 0.227639 | -0.108980 | -0.102382 | 0.240735 | -0.052869 | 0.050951 | -0.511097 | -0.023798 | -0.019783 | 0.353407 | 0.502917 | -0.195265 | 0.169617 | 0.019484 | -0.164477 | 0.134039 | 0.065868 |
| Treasury bills: 1 month | 0.225278 | 0.128380 | 0.259529 | 0.019614 | 0.025262 | -0.352326 | -0.132600 | -0.161266 | 0.086808 | -0.541618 | 0.152998 | -0.267793 | -0.031216 | 0.162620 | -0.103672 | 0.115781 | -0.119419 |
| Treasury bills: 1 year | 0.234082 | 0.053455 | 0.065061 | -0.036268 | -0.075239 | -0.075118 | 0.192900 | -0.096072 | 0.369251 | 0.282200 | -0.215109 | 0.348370 | 0.066918 | 0.001790 | 0.046062 | 0.547883 | 0.120218 |
| CPI Inflaction Rate | 0.173762 | -0.180793 | -0.575582 | -0.504796 | -0.493601 | -0.289764 | -0.058610 | 0.010737 | -0.074097 | -0.086223 | -0.010197 | -0.031054 | -0.013493 | -0.045574 | 0.023215 | -0.007326 | 0.039455 |
| Morgage Rate | 0.227671 | 0.150522 | 0.160041 | -0.041466 | -0.149447 | 0.043348 | 0.053773 | 0.891069 | 0.145476 | -0.116283 | 0.110641 | 0.020574 | 0.072023 | 0.044917 | 0.027882 | -0.039727 | -0.025172 |
| Prime Rate | 0.232769 | 0.079864 | 0.118729 | 0.003283 | -0.045882 | -0.147697 | -0.017152 | 0.019519 | -0.508541 | 0.292196 | 0.064961 | 0.185499 | -0.310753 | 0.371407 | 0.050503 | -0.003452 | -0.308323 |
| House Price Index(house and land) | -0.171404 | -0.381353 | -0.085620 | 0.663168 | -0.268190 | -0.450428 | 0.298552 | 0.103229 | 0.016639 | -0.009538 | 0.011201 | -0.002338 | -0.013328 | 0.005792 | 0.027148 | -0.033986 | 0.003009 |
| Unemployment rate | -0.181182 | 0.436201 | 0.172573 | 0.184132 | -0.757262 | 0.304913 | -0.061990 | -0.167199 | -0.067842 | -0.048952 | -0.015296 | -0.015750 | -0.028583 | -0.042345 | 0.013702 | 0.025426 | 0.041119 |
| Real GDP growth Seasonal adjustment | -0.003512 | -0.702397 | 0.578009 | -0.267773 | -0.242523 | 0.190165 | -0.055179 | -0.037586 | -0.019916 | -0.010577 | 0.004074 | -0.000240 | -0.015595 | -0.011826 | 0.009457 | 0.008515 | 0.003751 |
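The rule used to reduce `top_components_df` to `top_indicators_df` is not shown in this output. One common heuristic, sketched below as an assumption rather than as the author's method, keeps every indicator that has the largest absolute loading on at least one component. The `loadings` excerpt copies a few values from the table above; the names `top_per_component` and `selected` are hypothetical.

```python
import pandas as pd

# Small excerpt of the loadings shown above (rows = indicators, columns = PCs).
loadings = pd.DataFrame(
    {
        "PC1": [0.234779, 0.173762, -0.181182, -0.003512],
        "PC2": [0.013805, -0.180793, 0.436201, -0.702397],
        "PC5": [-0.061867, -0.493601, -0.757262, -0.242523],
    },
    index=[
        "GOC benchmark bond yields: 2 year",
        "CPI Inflaction Rate",
        "Unemployment rate",
        "Real GDP growth Seasonal adjustment",
    ],
)

# For each component, the indicator with the largest absolute loading.
top_per_component = loadings.abs().idxmax(axis=0)

# Keep the rows of every indicator that dominates at least one component,
# preserving the original row order.
selected = loadings.loc[loadings.index.isin(top_per_component.unique())]
print(top_per_component)
```

Applied to the 17 components shown, this rule alone would not reproduce `top_indicators_df` exactly (for example, `CPI Inflaction Rate` is kept there although it never has the largest absolute loading), so the actual selection criterion is presumably broader.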
most_important_economic_factors_df
| Quarter_Year | GOC Marketable Bonds Average Yield: 1-3 year | GOC Marketable Bonds Average Yield: 3-5 year | GOC benchmark bond yields: 2 year | GOC benchmark bond yields: 3 year | GOC benchmark bond yields: 7 year | GOC benchmark bond yields: long term | Treasury bills: 1 month | Treasury bills: 1 year | CPI Inflaction Rate | Morgage Rate | Prime Rate | House Price Index(house and land) | Unemployment rate | Real GDP growth Seasonal adjustment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020Q1 | 1.2 | 1.1 | 1.1 | 1.1 | 1.1 | 1.4 | 1.3 | 1.2 | 2.0 | 4.0 | 1.8 | 0.2 | 4.6 | -2.1 |
| 2020Q2 | 0.3 | 0.4 | 0.3 | 0.3 | 0.4 | 1.1 | 0.2 | 0.3 | 1.6 | 3.9 | 1.2 | 0.1 | 7.8 | -10.6 |
| 2020Q3 | 0.2 | 0.3 | 0.3 | 0.3 | 0.4 | 1.1 | 0.2 | 0.2 | 1.4 | 3.6 | 1.1 | 0.7 | 5.9 | 8.9 |
| 2020Q4 | 0.2 | 0.4 | 0.2 | 0.3 | 0.5 | 1.2 | 0.1 | 0.2 | 1.7 | 3.4 | 1.0 | 0.5 | 5.2 | 2.1 |
| 2021Q1 | 0.2 | 0.5 | 0.2 | 0.3 | 0.9 | 1.8 | 0.1 | 0.1 | 1.7 | 3.3 | 1.1 | 1.2 | 5.5 | 1.2 |
| 2021Q2 | 0.3 | 0.8 | 0.4 | 0.5 | 1.2 | 2.0 | 0.1 | 0.2 | 2.4 | 3.3 | 1.2 | 1.3 | 5.1 | -0.1 |
| 2021Q3 | 0.4 | 0.8 | 0.5 | 0.6 | 1.1 | 1.8 | 0.2 | 0.3 | 2.9 | 3.2 | 1.2 | 0.5 | 4.6 | 1.6 |
| 2021Q4 | 1.0 | 1.3 | 1.0 | 1.1 | 1.5 | 1.9 | 0.1 | 0.7 | 3.1 | 3.4 | 1.3 | 0.6 | 4.1 | 1.6 |
| 2022Q1 | 1.6 | 1.9 | 1.7 | 1.8 | 2.0 | 2.2 | 0.2 | 1.4 | 4.0 | 3.6 | 1.6 | 1.1 | 4.3 | 0.8 |
| 2022Q2 | 2.7 | 2.8 | 2.7 | 2.8 | 2.8 | 2.9 | 1.1 | 2.6 | 5.3 | 4.6 | 2.5 | 0.3 | 3.7 | 1.1 |
| 2022Q3 | 3.5 | 3.2 | 3.5 | 3.4 | 3.0 | 2.9 | 2.8 | 3.7 | 5.8 | 5.6 | 3.2 | 0.0 | 3.6 | 0.5 |
| 2022Q4 | 3.9 | 3.5 | 3.9 | 3.7 | 3.1 | 3.2 | 3.9 | 4.4 | 6.0 | 5.8 | 3.7 | -0.1 | 3.5 | -0.0 |
| 2023Q1 | 3.9 | 3.3 | 3.8 | 3.6 | 3.0 | 3.1 | 4.3 | 4.4 | 5.9 | 5.8 | 3.9 | -0.2 | 3.8 | 0.6 |
| 2023Q2 | 4.1 | 3.4 | 4.1 | 3.8 | 3.1 | 3.1 | 4.5 | 4.7 | 5.3 | 5.8 | 4.0 | 0.0 | 3.7 | 0.2 |
| 2023Q3 | 4.8 | 4.1 | 4.8 | 4.5 | 3.8 | 3.5 | 4.9 | 5.2 | 4.6 | 6.1 | 4.3 | -0.1 | 3.7 | -0.1 |
| 2023Q4 | 4.3 | 3.7 | 4.3 | 4.1 | 3.6 | 3.4 | 4.9 | 4.8 | 4.0 | 6.4 | 4.3 | -0.1 | 3.6 | 0.1 |
| 2024Q1 | 4.2 | 3.6 | 4.1 | 3.9 | 3.4 | 3.3 | 5.0 | 4.8 | 3.1 | 6.2 | 4.1 | 0.0 | 4.0 | 0.5 |
| 2024Q2 | 4.3 | 3.8 | 4.2 | 4.1 | 3.7 | 3.6 | 4.8 | 4.6 | 2.5 | 6.1 | 4.1 | 0.2 | 4.0 | 0.4 |
most_important_economic_factors_matrix
| | GOC Marketable Bonds Average Yield: 1-3 year | GOC Marketable Bonds Average Yield: 3-5 year | GOC benchmark bond yields: 2 year | GOC benchmark bond yields: 3 year | GOC benchmark bond yields: 7 year | GOC benchmark bond yields: long term | Treasury bills: 1 month | Treasury bills: 1 year | CPI Inflaction Rate | Morgage Rate | Prime Rate | House Price Index(house and land) | Unemployment rate | Real GDP growth Seasonal adjustment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOC Marketable Bonds Average Yield: 1-3 year | 1.000000 | 0.993209 | 0.999462 | 0.998315 | 0.977915 | 0.952575 | 0.958881 | 0.996983 | 0.725500 | 0.972380 | 0.989560 | -0.744187 | -0.746375 | -0.031706 |
| GOC Marketable Bonds Average Yield: 3-5 year | 0.993209 | 1.000000 | 0.994331 | 0.997760 | 0.993509 | 0.973534 | 0.927301 | 0.983328 | 0.756007 | 0.946726 | 0.971334 | -0.693431 | -0.785146 | -0.017584 |
| GOC benchmark bond yields: 2 year | 0.999462 | 0.994331 | 1.000000 | 0.998947 | 0.979915 | 0.955397 | 0.955022 | 0.996038 | 0.734083 | 0.969309 | 0.987723 | -0.735073 | -0.750504 | -0.021169 |
| GOC benchmark bond yields: 3 year | 0.998315 | 0.997760 | 0.998947 | 1.000000 | 0.986585 | 0.962796 | 0.944663 | 0.992004 | 0.740256 | 0.962145 | 0.982265 | -0.720636 | -0.765741 | -0.014608 |
| GOC benchmark bond yields: 7 year | 0.977915 | 0.993509 | 0.979915 | 0.986585 | 1.000000 | 0.988589 | 0.906120 | 0.963724 | 0.735470 | 0.920864 | 0.954083 | -0.626895 | -0.806113 | 0.007730 |
| GOC benchmark bond yields: long term | 0.952575 | 0.973534 | 0.955397 | 0.962796 | 0.988589 | 1.000000 | 0.889460 | 0.940557 | 0.726525 | 0.894927 | 0.935540 | -0.562839 | -0.794732 | 0.025609 |
| Treasury bills: 1 month | 0.958881 | 0.927301 | 0.955022 | 0.944663 | 0.906120 | 0.889460 | 1.000000 | 0.974308 | 0.569049 | 0.978951 | 0.987231 | -0.765780 | -0.633458 | -0.047291 |
| Treasury bills: 1 year | 0.996983 | 0.983328 | 0.996038 | 0.992004 | 0.963724 | 0.940557 | 0.974308 | 1.000000 | 0.709570 | 0.982448 | 0.995893 | -0.761639 | -0.720879 | -0.036478 |
| CPI Inflaction Rate | 0.725500 | 0.756007 | 0.734083 | 0.740256 | 0.735470 | 0.726525 | 0.569049 | 0.709570 | 1.000000 | 0.629513 | 0.666817 | -0.541372 | -0.751794 | 0.017866 |
| Morgage Rate | 0.972380 | 0.946726 | 0.969309 | 0.962145 | 0.920864 | 0.894927 | 0.978951 | 0.982448 | 0.629513 | 1.000000 | 0.987620 | -0.805028 | -0.617461 | -0.090534 |
| Prime Rate | 0.989560 | 0.971334 | 0.987723 | 0.982265 | 0.954083 | 0.935540 | 0.987231 | 0.995893 | 0.666817 | 0.987620 | 1.000000 | -0.762774 | -0.694479 | -0.048387 |
| House Price Index(house and land) | -0.744187 | -0.693431 | -0.735073 | -0.720636 | -0.626895 | -0.562839 | -0.765780 | -0.761639 | -0.541372 | -0.805028 | -0.762774 | 1.000000 | 0.388159 | 0.275270 |
| Unemployment rate | -0.746375 | -0.785146 | -0.750504 | -0.765741 | -0.806113 | -0.794732 | -0.633458 | -0.720879 | -0.751794 | -0.617461 | -0.694479 | 0.388159 | 1.000000 | -0.353492 |
| Real GDP growth Seasonal adjustment | -0.031706 | -0.017584 | -0.021169 | -0.014608 | 0.007730 | 0.025609 | -0.047291 | -0.036478 | 0.017866 | -0.090534 | -0.048387 | 0.275270 | -0.353492 | 1.000000 |
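The matrix above is symmetric with a unit diagonal, which is consistent with pairwise Pearson correlations between the columns of `most_important_economic_factors_df`. Assuming that is how it was produced, a minimal sketch on three of the columns follows. The input values are the rounded figures printed in the quarterly table, so the coefficients will only approximate those shown; the names `factors` and `corr_matrix` are hypothetical.

```python
import pandas as pd

# Quarter labels 2020Q1 .. 2024Q2 (18 quarters), as in the table above.
quarters = [f"{year}Q{q}" for year in range(2020, 2025) for q in range(1, 5)][:18]

# Three columns copied from the (rounded) quarterly table.
factors = pd.DataFrame(
    {
        "Prime Rate": [1.8, 1.2, 1.1, 1.0, 1.1, 1.2, 1.2, 1.3, 1.6,
                       2.5, 3.2, 3.7, 3.9, 4.0, 4.3, 4.3, 4.1, 4.1],
        "Unemployment rate": [4.6, 7.8, 5.9, 5.2, 5.5, 5.1, 4.6, 4.1, 4.3,
                              3.7, 3.6, 3.5, 3.8, 3.7, 3.7, 3.6, 4.0, 4.0],
        "Real GDP growth Seasonal adjustment": [-2.1, -10.6, 8.9, 2.1, 1.2, -0.1, 1.6, 1.6, 0.8,
                                                1.1, 0.5, -0.0, 0.6, 0.2, -0.1, 0.1, 0.5, 0.4],
    },
    index=pd.Index(quarters, name="Quarter_Year"),
)

# Pairwise Pearson correlation between columns (the pandas default method).
corr_matrix = factors.corr()
print(corr_matrix.round(3))
```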
Macroeconomic KPI Scenarios: Best Case, Worst Case, and Normal Case
#under implementation
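Until this step is implemented, one possible way to define the three scenarios is to read them off the historical distribution of each factor. The sketch below assumes the 5th, 50th and 95th percentiles as the low, median and high cases; these cut-offs and the names `factors` and `scenarios` are assumptions, not part of the project. Which tail counts as "best" or "worst" depends on the portfolio's exposure to each factor and is left open here.

```python
import pandas as pd

# Quarter labels 2020Q1 .. 2024Q2 (18 quarters), as in the table above.
quarters = [f"{year}Q{q}" for year in range(2020, 2025) for q in range(1, 5)][:18]

# Two columns copied from the (rounded) quarterly table.
factors = pd.DataFrame(
    {
        "Prime Rate": [1.8, 1.2, 1.1, 1.0, 1.1, 1.2, 1.2, 1.3, 1.6,
                       2.5, 3.2, 3.7, 3.9, 4.0, 4.3, 4.3, 4.1, 4.1],
        "Unemployment rate": [4.6, 7.8, 5.9, 5.2, 5.5, 5.1, 4.6, 4.1, 4.3,
                              3.7, 3.6, 3.5, 3.8, 3.7, 3.7, 3.6, 4.0, 4.0],
    },
    index=pd.Index(quarters, name="Quarter_Year"),
)

# One row per scenario, one column per factor (percentile cut-offs are assumed).
scenarios = factors.quantile([0.05, 0.50, 0.95])
scenarios.index = ["low", "median", "high"]
print(scenarios)
```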
Recalculate the expected return, standard deviation (risk), and Value-at-Risk (VaR) under each scenario.
#under implementation
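As a placeholder for this step, the sketch below shows how the three quantities could be recomputed for a base case and a stressed case. It assumes a parametric (normal) VaR, and every number in it is hypothetical: the two-asset portfolio, the return shift of 0.03 and the volatility multiplier of 1.5 are illustrative and not taken from the project data. The function name `portfolio_risk` is also an assumption.

```python
from statistics import NormalDist

import numpy as np


def portfolio_risk(weights, mean_returns, cov, confidence=0.95):
    """Return (expected return, standard deviation, parametric VaR).

    VaR is reported as a positive loss fraction and assumes normal returns.
    """
    w = np.asarray(weights, dtype=float)
    mu_p = float(w @ np.asarray(mean_returns, dtype=float))
    sigma_p = float(np.sqrt(w @ np.asarray(cov, dtype=float) @ w))
    z = NormalDist().inv_cdf(1.0 - confidence)  # negative quantile, about -1.645 at 95%
    var = -(mu_p + z * sigma_p)
    return mu_p, sigma_p, var


# Hypothetical two-asset portfolio (illustrative numbers only).
weights = [0.6, 0.4]
base_mu = [0.02, 0.01]  # expected returns per period
base_cov = [[0.0100, 0.0020],
            [0.0020, 0.0064]]

# Hypothetical stress scenario: returns shifted down, volatilities scaled by 1.5.
stress_mu = [m - 0.03 for m in base_mu]
stress_cov = (1.5 ** 2) * np.asarray(base_cov)

base = portfolio_risk(weights, base_mu, base_cov)
stressed = portfolio_risk(weights, stress_mu, stress_cov)
print("base    :", base)
print("stressed:", stressed)
```

Scaling the covariance matrix by the square of the multiplier scales every volatility by the multiplier while leaving correlations unchanged; a scenario that also changes correlations would need a different covariance matrix.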
Visualizing the impact of the stress scenario on the portfolio can help in understanding the potential risks.
In this section, we will use a decision tree to model how different scenarios might cascade through the portfolio, affecting asset values, returns, and overall portfolio performance.
#under implementation
#under implementation
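Pending the implementation, the idea of a decision tree that maps macroeconomic conditions to portfolio outcomes can be illustrated with a very small regression tree written by hand. This is a sketch under stated assumptions, not the project's model: the factor values are taken from the quarterly table above, but the portfolio returns in `returns` are hypothetical, as are the function names. In practice a library implementation such as scikit-learn's `DecisionTreeRegressor` would be the usual choice.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return (total SSE, feature, threshold) of the best binary split, or None."""
    best = None
    for feature in rows[0]:
        # Every distinct value except the largest is a candidate threshold.
        for threshold in sorted({r[feature] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def build_tree(rows, targets, depth=0, max_depth=2, min_samples=2):
    """Recursively grow a regression tree; each leaf stores the mean target."""
    split = None
    if depth < max_depth and len(rows) >= min_samples:
        split = best_split(rows, targets)
    if split is None or split[0] >= sse(targets):
        return {"value": sum(targets) / len(targets)}
    _, feature, threshold = split
    left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right = [i for i, r in enumerate(rows) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           depth + 1, max_depth, min_samples),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            depth + 1, max_depth, min_samples),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Factor values from the quarterly table (2020Q2, 2020Q4, 2021Q2, 2022Q2, 2023Q1, 2024Q2).
macro = [
    {"Prime Rate": 1.2, "Unemployment rate": 7.8},
    {"Prime Rate": 1.0, "Unemployment rate": 5.2},
    {"Prime Rate": 1.2, "Unemployment rate": 5.1},
    {"Prime Rate": 2.5, "Unemployment rate": 3.7},
    {"Prime Rate": 3.9, "Unemployment rate": 3.8},
    {"Prime Rate": 4.1, "Unemployment rate": 4.0},
]
# Hypothetical quarterly portfolio returns (illustrative, not project results).
returns = [-0.10, 0.06, 0.05, -0.02, -0.01, 0.00]

tree = build_tree(macro, returns)

# Hypothetical stress scenario: high prime rate combined with high unemployment.
stress = {"Prime Rate": 4.5, "Unemployment rate": 8.0}
print(predict(tree, stress))
```

Each path from the root to a leaf reads as a scenario rule (for example, an unemployment rate above the learned threshold), and the leaf value is the average return observed under that rule in the training data.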